GenCore version 4.5



Copyright (c) 1993 - 2000 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on: February 7, 2002, 08:21:14 ; Search time 3842.15 Seconds
(without alignments)
1807.663 Million cell updates/sec

Title: US-09-394-745-5893
Perfect score: 421
Sequence: 1 gaaaaaaataactcggaaaa ccatgttggttcctgcatgc 421

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 1472140 seqs, 8248589755 residues
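
The throughput figure in the header can be checked from the other header values. The sketch below is a minimal illustration, not Compugen's own accounting: it assumes the query (length 421, equal to the perfect score under identity scoring) is compared against every database residue on both strands. The factor of two is an assumption, chosen because the reported hit total is exactly twice the number of database sequences. The function names are hypothetical.

```python
STRANDS = 2  # assumption: each database entry is scanned on both strands


def total_hits(num_seqs: int, strands: int = STRANDS) -> int:
    """Number of query/database comparisons that receive a score."""
    return num_seqs * strands


def cell_updates_per_sec(query_len: int, db_residues: int, seconds: float,
                         strands: int = STRANDS) -> float:
    """Dynamic-programming cells filled per second of search time."""
    return query_len * db_residues * strands / seconds


if __name__ == "__main__":
    # Values copied from the report header.
    print(total_hits(1_472_140))
    rate = cell_updates_per_sec(421, 8_248_589_755, 3842.15)
    print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under these assumptions the two printed values agree with the hit total and the cell-update rate shown in the report.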



Total number of hits satisfying chosen parameters: 2944280

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database : GenEmbl:*
1: gb_ba:*
2: gb_htg:*
3: gb_in:*
4: gb_om:*
5: gb_ov:*
6: gb_pat:*
7: gb_ph:*
8: gb_pl:*
9: gb_pr:*
10: gb_ro:*
11: gb_sts:*
12: gb_sy:*
13: gb_un:*
14: gb_vi:*
15: em_ba:*
16: em_fun:*
17: em_hum:*
18: em_in:*
19: em_om:*
20: em_or:*
21: em_ov:*
22: em_pat:*
23: em_ph:*
24: em_pl:*
25: em_ro:*
26: em_sts:*
27: em_sy:*
28: em_un:*
29: em_vi:*
30: em_htgo_hum:*
31: em_htgo_inv:*
32: em_htgo_rod:*
33: em_htg_hum:*
34: em_htg_inv:*
35: em_htg_rod:*
36: em_htg_other:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
20
41.4
21
AC022780 Mus muscu
AC079834 Homo sapi 
AC005386 citb_57_l 
AC022327 Mus muscu 
AC074310 Mus muscu 

AC005302 Mus muscu 
AC084060 Mus muscu 



ALIGNMENTS 



RESULT 1
ATAC009606/C
LOCUS ATAC009606 91924 bp DNA PLN 25-JAN-2001
DEFINITION Arabidopsis thaliana chromosome III BAC F22F7 genomic sequence,
complete sequence.
ACCESSION AC009606
VERSION AC009606.4 GI:12484386
KEYWORDS HTG.
SOURCE thale cress.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; core eudicots; Rosidae; eurosids II;
Brassicales; Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 1 to 91924)
AUTHORS Lin,X., Kaul,S., Town,C.D., Benito,M., Creasy,T.H., Haas,B., Wu,D.,
Maiti,R., Ronning,C.M., Koo,H., Fujii,C.Y., Utterback,T.R.,
Barnstead,M.E., Bowman,C.L., White,O., Nierman,W.C. and Fraser,C.M.
TITLE Arabidopsis thaliana chromosome III BAC F22F7 genomic sequence
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 91924)
AUTHORS Lin,X. and Kaul,S.
TITLE Direct Submission
JOURNAL Submitted (28-AUG-1999) The Institute for Genomic Research, 9712
Medical Center Dr, Rockville, MD 20850, USA, xlin@tigr.org
REFERENCE 3 (bases 1 to 91924)
AUTHORS Lin,X.
TITLE Direct Submission
JOURNAL Submitted (25-JAN-2001) The Institute for Genomic Research, 9712
Medical Center Dr., Rockville, MD 20850, USA
COMMENT On Jan 25, 2001 this sequence version replaced gi:12280792.
Address all correspondence to: at@tigr.org
BAC clone F22F7 is from Arabidopsis chromosome III and is near the
molecular marker mil72.
The orientation of the sequence is from SP6 to T7 end of the BAC
clone.
Genes were identified by a combination of three methods: Gene
prediction programs including GRAIL (available by anonymous ftp
from arthur.epm.ornl.gov), Genefinder (Phil Green, University of
Washington), Genscan (Chris Burge,
http://gnomic.stanford.edu/~chris/GENSCANW.html), and NetPlantGene
(http://www.cbs.dtu.dk/netpgene/cbsnetpgene.html), searches of the
complete sequence against a peptide database and the Arabidopsis
EST database at TIGR (http://www.tigr.org/tdb/at/at.html).
Annotated genes are named to indicate the level of evidence for
their annotation. Genes with similarity to other proteins are named
after the database hits. Genes without significant peptide
similarity but with EST similarity are named as 'unknown' proteins.
Genes without protein or EST similarity, that are predicted by more
than two gene prediction programs over most of their length are
annotated as 'hypothetical' proteins. Genes encoding tRNAs are
predicted by tRNAscan-SE (Sean Eddy,
http://genome.wustl.edu/eddy/tRNAscan-SE/). Simple repeats are
identified by repeatmasker (Arian Smit,
http://ftp.genome.washington.edu/RM/RepeatMasker.html).
FEATURES Location/Qualifiers
source 1..91924
/organism="Arabidopsis thaliana"
/cultivar="Columbia"
/db_xref="taxon:3702"
/chromosome="III"
/map="mil72"
/clone="F22F7"
misc_feature 1..14310
/note="the annotation for genes within this region can be
found in the overlapping bac F18C1 sequence 86291-100600"
mRNA complement(join(14688..14927,15027..15167,15254..15373,
15468..15555,15643..15761,15940..16131,16211..16402,
16508..16615,16717..16833,17184..>17303))
/gene="F22F7.1"
gene complement(14688..>17303)
/gene="F22F7.1"
/note="identical to GB:AAF22525 from [Arabidopsis
thaliana]"
CDS complement(join(14850..14927,15027..15167,15254..15373,
15468..15555,15643..15761,15940..16131,16211..16402,
16508..16615,16717..16833,17184..17303))
/gene="F22F7.1"
/codon_start=1
/product="26S proteasome AAA-ATPase subunit RPT5a"
/protein_id="AAF64530.1"
/db_xref="GI:7596759"
/translation="MATPMVEDTSSFEEDQLASMSTEDITRATRLLDNEIRILKEDAQ

RTNLECDSYKEKIKENQEKIKLNKQLPYLVGNIVEILEMNPEDDAEEDGANIDLDSQR 

KGKCVVLKTSTRQTIFLPVVGLVDPDSLKPGDLVGVNKDSYLILDTLPSEYDSRVKAM 

EVDEKPTEDYNDIGGLEKQIQELVEAIVLPMTHKERFEKLGVRPPKGVLLYGPPGTGK 

TLMARACAAQTNATFLKLAGPQLVQMFIGDGAKLVRDAFQLAKEKAPCIIFIDEIDAI 

GTKRFDSEVSGDREVQRTMLELLNQLDGFSSDERIKVIAATNRADILDPALMRSGRLD

RKIEFPHPTEEARARILQIHSRKMNVHPDVNFEELARSTDDFNGAQLKAVCVEAGMLA 

LRRDATEVNHEDFNEGIIQVQAKKKASLNYYA" 

repeat_region complement(15902..15938)
/rpt_family="(TA)n"
repeat_region complement(18181..18757)
/note="ATREP6 | ATREP6 ATREP6 repeat - a consensus."
repeat_region complement(18184..18375)
/note="ATREP9 | ATREP9 ATREP9 repeat - a consensus."
repeat_region complement(19305..19353)
/rpt_family="POLY_A"
tRNA 19434..19505
/gene="F22F7.2"
/product="tRNA-Gly"
/anticodon=(pos:19467..19469,aa:Gly)



gene 19434..19505
/gene="F22F7.2"
mRNA complement(join(19554..19816,19914..19949,20031..20105,
20246..20323,20425..20503,20596..20757,20987..21051,
21140..21229,21832..21947,22115..22286))
/gene="F22F7.3"
gene complement(19554..22286)
/gene="F22F7.3"
/note="identical to GB:CAA05054 from [Arabidopsis
thaliana]"
CDS complement(join(19709..19816,19914..19949,20031..20105,
20246..20323,20425..20503,20596..20757,20987..21051,
21140..21229,21832..21947,22115..22232))
/gene="F22F7.3"
/codon_start=1
/product="alpha subunit of F-actin capping protein"
/protein_id="AAF64531.1"
/db_xref="GI:7596760"

/translation="MADEEDELLETELSYDQKKEIAKWFFLNAPAGEINYVAKDLKAV 
LSDEEVYNEAAMEAFPVYNKTHMICLEMPSGAGDVIVSSYSEINENEYLDPRTAQVAI 
VDHVKQICTKVRPANDEELPSLYIEEYRYALDAEIQRYVSESYPKGMSAVNCVKGKDN 
EGPGSDFELVVIITAMRLSPQNFCNGSWRSVWNIDFQDESQVLDIKGKLQVGAHYFEE 
GNVELDAKKDFQDSTIFQSADDCAIAIANIIRHHETEYLASLEVAYSKLPDNTFKDLR
RKLPVTRTLFPWQNTLQFSLTREVEKELGLGK" 

mRNA complement(join(21860..21947,22051..22286))
/gene="F22F7.3"
CDS complement(22107..22232)
/gene="F22F7.3"
/note="unknown protein"
/codon_start=1
/protein_id="AAF64545.1"
/db_xref="GI:7596774"
/translation="MADEEDELLETELSYDQKKEIAKWFFLNAPAGEINYVAKGT"
mRNA complement(join(<22769..23281,23762..23867,23966..24084,
24186..24383,24484..24549,24636..24746,25154..>25267))
/gene="F22F7.4"
gene complement(<22769..>25267)
/gene="F22F7.4"
/note="predicted by genemark.hmm, contains Pfam
profile: PF01553 Acyltransferase"
CDS complement(join(22769..23281,23762..23867,23966..24084,
24186..24383,24484..24549,24636..24746,25154..25267))
/gene="F22F7.4"
/note="hypothetical protein"
/codon_start=1
/protein_id="AAF64532.1"
/db_xref="GI:7596761"
/translation="MGIHFVDKADLWKSALLFNLKLRDRFRIAVDDHRGRATDLTAEE
ESALFRMLQTVAVPLIGNACHVFMNGFNRVQVYGLEKLHDALLNRPKNKPLVTVSNHV 
ASVDDPFVIASLLPPKFLLDARNLRWTLCATDRCFKNPVTSAFSRSVKVLPISRGEGI 
YQQGMDIAISKLNNGGWVHIFPEGSRSRDGGKTMGSAKRGIGRLILDADTLPMVVPFV 
HTGMQDIMPVGASVPRIGKTVTVIIGDPIHFNDILSTEGAQHVSRKHLYDAVSSRIGQ 
RLYDLKAQVDRVYIEQQSMMSHNAKTPSDRAAEIFHRVDWDSFGMGAQFSEESSPSSK 
PIGQSDDRIVRSPKRRVSPEGGVSLKIKKLMDSTEiyiMGFAARGLLMNEYKSRVESANV 
GRPLKAWREYFMNRGL" 

mRNA complement(join(25975..26514,26764..26988,27262..27395))
/gene="F22F7.5"
gene complement(25975..27395)
/gene="F22F7.5"
/note="similar to stress related protein GB:AAD51854 from
[Vitis riparia]"
CDS complement(join(26041..26514,26764..26988,27262..27303))
/gene="F22F7.5"
/codon_start=1
/product="stress related protein, putative"
/protein_id="AAF64533.1"
/db_xref="GI:7596762"
/translation="MATQTDLAQPKLDMTKEEKERLKYLQFVQAAAVEALLRFALIYA
KAKDKSGPLKPGVESVEGAVKTWGPVYEKYHDVPVEVLKYMDQKVDMSVTELDRRVP
PVVKQVSAQAISAAQIAPIVARALASEVRRAGVVETASGMAKSVYSKYEPAAKELYAN
YEPKAEQCAVSAWKKLNQLPLFPRLAQVAVPTAAFCSEKYNDTVVKAAEKGYRVTSYM
PLVPTERISKIFAEEKAETEPLEFHPLD"
repeat_region 27792..27922
/rpt_family="(CAAAA)n"
repeat_region 27796..27842
/rpt_family="POLY_A"
mRNA complement(<29103..29551)
/gene="F22F7.6"
gene complement(<29103..29551)
/gene="F22F7.6"
CDS complement(29103..29462)
/gene="F22F7.6"
/note="unknown protein"
/codon_start=1
/protein_id="AAF64534.1"
/db_xref="GI:7596763"
/translation="MTNTRAIYAVIAILAIVISAVESTGDFGDSLDFVRAGSSSLFSG
CTGSIAECIAEEEEMEFDSDISRRILAQKKYISYGAMRRNSVPCSRRGASYYNCQRGA
QANPYSRGCSTITRCRR"
repeat_region complement(30238..30318)
/rpt_family="POLY_A"
repeat_region 31392..31423
/rpt_family="POLY_A"
mRNA complement(join(<32605..32726,32816..32856,33102..33246,
33331..33627,34033..34104,34188..34439,34681..34789,
34872..34991,35356..>35460))
/gene="F22F7.7"
gene complement(<32605..>35460)
/gene="F22F7.7"
/note="predicted by genemark.hmm"



Query Match 14.2%; Score 59.8; DB 8; Length 91924;
Best Local Similarity 76.8%; Pred. No. 1.3e-05;
Matches 73; Conservative 0; Mismatches 22; Indels 0; Gaps 0;

Qy    164 aggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggcccaa 223
          ||||||  | |||||| |||| || | | |||| |||||||| |||||||| |  ||
Db   7608 AGGTTGTTTATGGCTAAGTGGTATTACCGGTTCTATCGCCTATAACTGGTCCCAACCTGC 7549

Qy    224 tatgaagcctagcgtcaagatcatccacgcaaggt 258
           |||||  | || ||||||||||||||||| ||||
Db   7548 CATGAAAACCAGTGTCAAGATCATCCACGCCAGGT 7514
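
The summary figures printed with each alignment are related by simple arithmetic, which the sketch below reproduces. It is a hedged illustration rather than the program's documented scoring: the match value of 1.0 and mismatch value of -0.6 are assumptions inferred from the reported scores in this report (for example 73 matches and 22 mismatches giving 59.8), and gap penalties are not modeled because every alignment shown has zero indels. The function name `summarize` is hypothetical.

```python
PERFECT_SCORE = 421          # "Perfect score" from the report header
MATCH, MISMATCH = 1.0, -0.6  # assumed values, inferred from the reported scores


def summarize(matches: int, mismatches: int, indels: int = 0) -> dict:
    """Recompute the per-result summary figures for a gapless alignment."""
    columns = matches + mismatches + indels
    score = MATCH * matches + MISMATCH * mismatches
    return {
        "columns": columns,
        "score": round(score, 1),
        "query_match_pct": round(100 * score / PERFECT_SCORE, 1),
        "best_local_similarity_pct": round(100 * matches / columns, 1),
    }


if __name__ == "__main__":
    # (matches, mismatches) as reported for RESULT 1 and RESULT 3.
    for matches, mismatches in [(73, 22), (74, 30)]:
        print(summarize(matches, mismatches))
    # The query span 164..258 of RESULT 1 covers the same number of columns.
    print(258 - 164 + 1)
```

For RESULT 1 the column count also equals the length of the aligned query span (positions 164 to 258), which is a quick consistency check on a reconstructed alignment.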
RESULT 2
ATAC011620/C
LOCUS ATAC011620 100887 bp DNA PLN 24-JAN-2001
DEFINITION Arabidopsis thaliana chromosome III BAC F18C1 genomic sequence,
complete sequence.
ACCESSION AC011620
VERSION AC011620.8 GI:12408732
KEYWORDS HTG.
SOURCE thale cress.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; core eudicots; Rosidae; eurosids II;
Brassicales; Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 1 to 100887)
AUTHORS Lin,X., Kaul,S., Town,C.D., Benito,M., Creasy,T.H., Haas,B., Wu,D.,
Ronning,C.M., Koo,H., Fujii,C.Y., Utterback,T.R., Barnstead,M.E.,
Bowman,C.L., White,O., Nierman,W.C. and Fraser,C.M.
TITLE Arabidopsis thaliana chromosome III BAC F18C1 genomic sequence
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 100887)
AUTHORS Lin,X. and Kaul,S.
TITLE Direct Submission
JOURNAL Submitted (08-OCT-1999) The Institute for Genomic Research, 9712
Medical Center Dr, Rockville, MD 20850, USA, xlin@tigr.org
REFERENCE 3 (bases 1 to 100887)
AUTHORS Lin,X.
TITLE Direct Submission
JOURNAL Submitted (24-JAN-2001) The Institute for Genomic Research, 9712
Medical Center Dr., Rockville, MD 20850, USA
COMMENT On Jan 24, 2001 this sequence version replaced gi:12280779.
Address all correspondence to: at@tigr.org
BAC clone F18C1 is from Arabidopsis chromosome III and is near the
molecular marker mil72.
The orientation of the sequence is from SP6 to T7 end of the BAC
clone.
Genes were identified by a combination of three methods: Gene
prediction programs including GRAIL (available by anonymous ftp
from arthur.epm.ornl.gov), Genefinder (Phil Green, University of
Washington), Genscan (Chris Burge,
http://gnomic.stanford.edu/~chris/GENSCANW.html), and NetPlantGene
(http://www.cbs.dtu.dk/netpgene/cbsnetpgene.html), searches of the
complete sequence against a peptide database and the Arabidopsis
EST database at TIGR (http://www.tigr.org/tdb/at/at.html).
Annotated genes are named to indicate the level of evidence for
their annotation. Genes with similarity to other proteins are named
after the database hits. Genes without significant peptide
similarity but with EST similarity are named as 'unknown' proteins.
Genes without protein or EST similarity, that are predicted by more
than two gene prediction programs over most of their length are
annotated as 'hypothetical' proteins. Genes encoding tRNAs are
predicted by tRNAscan-SE (Sean Eddy,
http://genome.wustl.edu/eddy/tRNAscan-SE/). Simple repeats are
identified by repeatmasker (Arian Smit,
http://ftp.genome.washington.edu/RM/RepeatMasker.html).
FEATURES Location/Qualifiers
source 1..100887
/organism="Arabidopsis thaliana"
/cultivar="Columbia"
/db_xref="taxon:3702"
/chromosome="III"
/map="mil72"
/clone="F18C1"
misc_feature complement(1..12001)
/note="overlap with BAC clone F10A16 (AC012393:1..12001)."
repeat_region 2971..3040
/rpt_family="(GAA)n"
repeat_region complement(5045..5109)
/rpt_family="(GAA)n"
repeat_region complement(14590..14654)
/rpt_family="POLY_A"
mRNA join(<15397..15585,16169..16247,16449..16543,16641..16772,
17050..17235,17680..17843,17919..17997,18184..18312,
18413..18540,18736..>19141)
/gene="F18C1.1"
gene <15397..>19141
/gene="F18C1.1"
/note="similar to GB:AAC27644"
CDS join(15397..15585,16169..16247,16449..16543,16641..16772,
17050..17235,17680..17843,17919..17997,18184..18312,
18413..18540,18736..19141)
/gene="F18C1.1"
/codon_start=1
/product="putative importin alpha"
/protein_id="AAF26125.1"
/db_xref="GI:6714438"

/translation="MKGGETMSVRRSGYKAVVDGVGGRRRREDDMVEIRKAKREESLL 
KKRREALPHSPSADSLDQKLISCIWSDERDLLIEATTQIRTLLCGEMFNVRVEEVIQA 
GLVPRFVEFLTWDDSPQLQFEAAWALTNIASGTSENTEVVIDHGAVAILVRLLNSPYD 
VVREQVVWALGNISGDSPRCRDIVLGHAALPSLLLQLNHGAKLSMLVNAAWTLSNLCR 
GKPQPPFDQVSAALPALAQLIRLDDKELLAYTCWALVYLSDGSNEKIQAVIEANVCAR 
LIGLSIHRSPSVITPALRTIGNIVTGNDSQTQHIIDLQALPCLVNLLRGSYNKTIRKE 
ACWTVSNITAGCQSQIQAVFDADICPALVNLLQNSEGDVKKEAAWAICNAIAGGSYKQ 
IMFLVKQECIKPLCDLLTCSDTQLVMVCLEALKKILKVGEVFSSRHAEGIYQCPQTNV 
NPHAQLIEEAEGLEKIEGLQSHENNDIYETAVKILETYWMEEEEEEDQEQQDMIYFPV
DNFANMPTSSGTLSEMHCGP" 

repeat_region 19025..19059
/rpt_family="(GAA)n"
mRNA complement(join(19632..20011,20101..20181,20264..20329,
20424..20501,20581..20652,20874..20963,21057..21333,
21648..>21871))
/gene="F18C1.2"
gene complement(19632..>21871)
/gene="F18C1.2"
/note="similar to putative syntaxin protein GB:CAB52175
from [Arabidopsis thaliana]"
CDS complement(join(19904..20011,20101..20181,20264..20329,
20424..20501,20581..20652,20874..20963,21057..21333,
21648..21871))
/gene="F18C1.2"
/codon_start=1
/product="putative syntaxin protein, AtSNAP33"



/protein_id="AAF26126.1"
/db_xref="GI:6714439"
/translation="MATRNRTLLFRKYRNSLRSVRAPMGSSSSSTLTEHNSLTGAKSG
LGPVIEMASTSLLNPNRSYAPVSTEDPGNSSRGTITVGLPPDWVDVSEEISVYIQRAR
TKMAELGKAHAKALMPSFGDGKEDQHQIETLTQEVTFLLKKSEKQLQRLSAAGPSEDS
NVRKNVQRSLATDLQNLSMELRKKQSTYLKRLRLQKEDGADLEMNLNGSRYKAEDDDF
DDMVFSEHQMSKIKKSEEISIEREKEIQQVVESVSELAQIMKDLSALVIDQGTIVDRI
DYNIQNVASTVDDGLKQLQKAERTQRQGGMVMCASVLVILCFIMLVLLILKEILL"
repeat_region 22133..22223
/rpt_family="(TAAA)n"
repeat_region 22179..22226
/rpt_family="POLY_A"
mRNA join(<22810..22885,23454..23596,23884..23922,24150..24328,
24593..>24701)
/gene="F18C1.3"
gene <22810..>24701
/gene="F18C1.3"
CDS join(22810..22885,23454..23596,23884..23922,24150..24328,
24593..24701)
/gene="F18C1.3"
/note="unknown protein"
/codon_start=1
/protein_id="AAF26127.1"
/db_xref="GI:6714440"
/translation="MDSDSWSDRLASATRRYQLAFPSRSDTFLGFEEIDGEEEFREEF
ACPFCSDYFDIVSLCCHIDEDHPMEAKNGVCPVCAVRVGVDMSLFGGSSCIVSSSSSS
NVAADPLLSSFISPIADGFFTTESCISAETGPVKKTTIQCLPEQNAKKTSLSAEDHKQ
KLKRSEFVRELLSSTILDDSL"
repeat_region complement(27249..27313)
/rpt_family="(TAAAA)n"
mRNA join(28290..28335,28744..29013,29330..29424,29536..29754,
29843..30597)
/gene="F18C1.4"
gene 28290..30597
/gene="F18C1.4"
/note="similar to GB:CAA74049"
CDS join(28809..29013,29330..29424,29536..29754,29843..30211)
/gene="F18C1.4"
/codon_start=1
/product="putative transcription factor"
/protein_id="AAF26128.1"
/db_xref="GI:6714441"
/translation="MAMQTVREGLFSAPQTSWWTAFGSQPLAPESLAGDSDSFAGVKV
GSVGETGQRVDKQSNSATHLAFSLGDVKSPRLVPKPHGATFSMQSPCLELGFSQPPIY
TKYPYGEQQYYGVVSAYGSQSRVMLPLNMETEDSTIYVNSKQYHGIIRRRQSRAKAAA
VLDQKKLSSRCRKPYMHHSRHLHALRRPRGSGGRFLNTKSQNLENSGTNAKKGDGSMQ
IQSQPKPQQSNSQNSEVVHPENGTMNLSNGLNVSGSEVTSMNYFLSSPVHSLGGMVMP
SKWIAAAAAMDNGCCNFKT"
mRNA join(36087..36254,36777..37418,37507..37875,37965..38069,
38171..38353,38529..38670,38758..38939,39026..39111,
39212..39260,39735..39801,39900..40385,40434..40540,
41173..41240,41336..41552,41793..42008,42111..42266,
42340..42393,42479..42687,42782..42845,43070..43389,
43742..43925,44024..46030,46087..46203,46311..46581)
/gene="F18C1.5"
gene 36087..46581
/gene="F18C1.5"
CDS join(36093..36254,36777..37418,37507..37875,37965..38069,
38171..38353,38529..38670,38758..38939,39026..39111,
39212..39260,39735..39801,39900..40385,40434..40540,
41173..41240,41336..41552,41793..42008,42111..42266,
42340..42393,42479..42687,42782..42845,43070..43389,
43742..43925,44024..46030,46087..46203,46311..46331)
/gene="F18C1.5"
/note="unknown protein"
/codon_start=1
/protein_id="AAF26129.1"
/db_xref="GI:6714442"

/translation="MVRYLNVDNIFVRATSPPSFALEVFVRCEGESKFKRLCNPFLYT
PSAPYPLEVEAVVTNHLVVRGSYRSLSLIVYGNIVKDLGQYNIILEGRSVTDIVSSTE 
GNLEDLPLVLHSVNRTIEECLSSLDIVSLPLAAVDLPVEVKRLLQLLLKIFDKLATND 
VVNKFVDTVVSGVSSYVTDNVDFFLKNKNCSAVTSSLDSGLFHDIVDRVKEDILDLNE 
IQESDVALGLFSFLESETYLATSQQLWMLSPYIQFERDSLCTVLPKLSKGKATLLGL 
SLAFLLCSGREGCLQFVNSGGMDQLVYLFGHDGQNSTTITLLLLGVVEQATRHSVGCE 
GFLGWWPREDGSIPSGKSEGYCLLLKLLMQKPCHEIASLAIYILRRLRIYEVISRYEF 
AVLSALEGLSNSHGAATHNLNMLSDAKSQLQKLQNLMKSLGSVEDPSPSAYAERSLVS 
DHSEGWLSYKATSKLTSSWTCPFYSSGIDSHILALLKERGFLPLSAALLSMPELHSKV 
GDIMDVFTDIAMFIGNIILSFMFSRTGLSFLLHHPELTATIIQSLKGSVDLNKEECVP 
LHYASILISKGFTCSLLEIGINLEMHLRVVSAVDRLLKSIQQTEEFLWILWELRDVSR 
SDCGREALLTLGVFPEALAVLIEALHSAKDMEPAVENSGISPLNLAICHSAAEIFEVI
VSDSTASCLHAWIEHAPVLHKALHTLSPGGSNRKDAPSRLLKWIDAGVVYHKHGVGGL
LRYAAVLASGGDAQLSSSSILALDLTPAENGAGESTNVSEMNVLDNLGKVIFEKSFEG
VNLSDSSISQLTTALRILALISDNSVYAIVFLFKCSEMLFFVQTVAAALYDEGAVTVV 
YAILGTKEQYRNTKLMKALLRLHREVSPKLAACAADLSSHYPDSALGFGAVCHLIVSA 
LVCWPVYGWIPGLFHTLLSGVQTSSVPALGPKETCSFLCILSDILPEEGVWFWKSGMP 
LLSGLRKLAVGTLMGPQKEKQINWYLEPGPLEKLINHLTPNLDKIAKIIQHHAVSALV 
VIQDMLRVFIVRIACQRVEHASILLRPIFSSIRDGILDQSSTRDTEAYMVYRYLNFLA 
SLLEHPHAKGLLLEEGIVQLLVEVLERCYDATYPSENRVLEYGIVSASSVIQWCIPAF 
RSISLLCDSQVPLLCFQKKELLASLSAKDCALIFPFVLKFCQVLPVGNELLSCLGAFK 
DLSSCGEGQDGLVSLLFHLFSGTEESVSERWCDTNSLSLDQLDMKKNPPFLSCWIKLL 

Query Match 14.2%; Score 59.8; DB 8; Length 100887;
Best Local Similarity 76.8%; Pred. No. 1.3e-05;
Matches 73; Conservative 0; Mismatches 22; Indels 0; Gaps 0;

Qy    164 aggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggcccaa 223
          ||||||  | |||||| |||| || | | |||| |||||||| |||||||| |  ||
Db  93898 AGGTTGTTTATGGCTAAGTGGTATTACCGGTTCTATCGCCTATAACTGGTCCCAACCTGC 93839

Qy    224 tatgaagcctagcgtcaagatcatccacgcaaggt 258
           |||||  | || ||||||||||||||||| ||||
Db  93838 CATGAAAACCAGTGTCAAGATCATCCACGCCAGGT 93804



RESULT 3
G69921
LOCUS G69921 255 bp DNA STS 08-JUN-2001
DEFINITION 695251131FN194 maize leaf DNA Zea mays STS genomic, sequence tagged
site.
ACCESSION G69921
VERSION G69921.1 GI:14331606
KEYWORDS STS.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 255)
AUTHORS Yang,Y.J., Guo,L., Ashlock,D.A., Wen,T.J. and Schnable,P.S.
TITLE 3' UTR sequences of maize genes
JOURNAL Unpublished
COMMENT Contact: Schnable, P.S.
Schnable laboratory
Iowa State University
G405 Agronomy Hall, Ames, IA 50011, USA
Tel: 515-294-0975
Fax: 515-294-2299
Email: schnable@iastate.edu
Primer A: CGAGGGAGTGGGTGGTC
Primer B: TGGTACCATCTGCATACACAAC
PCR Profile:
Denaturation: 94 degrees C for 30 seconds
Annealing: 60 degrees C for 45 seconds
Polymerization: 72 degrees C for 90 seconds
PCR cycles: 31
Thermal cycler: Perkin Elmer TC
Protocol:
Template: 10-20 ng
Primer: each 0.5 uM
dNTPs: each 200 uM
Taq Polymerase: 0.05 units/ul
Total vol: 20 ul
Buffer:
MgCl2: 2 mM
KCl: 50 mM
Tris-HCl: 20 mM
pH: 8.4.
FEATURES Location/Qualifiers
source 1..255
/organism="Zea mays"
/strain="DE811"
/db_xref="taxon:4577"
/clone_lib="maize leaf DNA"
/note="PCR products amplified from genomic DNA"
STS <1..>255
BASE COUNT 46 a 82 c 65 g 62 t
ORIGIN

Query Match 13.3%; Score 56; DB 11; Length 255;
Best Local Similarity 71.2%; Pred. No. 0.00023;
Matches 74; Conservative 0; Mismatches 30; Indels 0; Gaps 0;

Qy    160 ccgtaggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggc 219
          | | |||    || |||||  | || |||  | | || |||||||||||||||||  |||
Db     66 CTGCAGGGACGCTGTGGCTCAGCGGCATCGTCGGCTCCATCGCCTACAACTGGTCCAGGC 125

Qy    220 ccaatatgaagcctagcgtcaagatcatccacgcaaggttgcat 263
          |    |||||  | | |||||||||||||||||| ||||  | |
Db    126 CGGGCATGAAAACCAACGTCAAGATCATCCACGCCAGGTCCCCT 169



RESULT 4
G69610
LOCUS G69610 387 bp DNA STS 08-JUN-2001
DEFINITION 695251131FB113 maize leaf DNA Zea mays STS genomic, sequence tagged
site.
ACCESSION G69610
VERSION G69610.1 GI:14331295
KEYWORDS STS.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 387)
AUTHORS Yang,Y.J., Guo,L., Ashlock,D.A., Wen,T.J. and Schnable,P.S.
TITLE 3' UTR sequences of maize genes
JOURNAL Unpublished
COMMENT Contact: Schnable, P.S.
Schnable laboratory
Iowa State University
G405 Agronomy Hall, Ames, IA 50011, USA
Tel: 515-294-0975
Fax: 515-294-2299
Email: schnable@iastate.edu
Primer A: CGAGGGAGTGGGTGGTC
Primer B: TGGTACCATCTGCATACACAAC
PCR Profile:
Denaturation: 94 degrees C for 30 seconds
Annealing: 60 degrees C for 45 seconds
Polymerization: 72 degrees C for 90 seconds
PCR cycles: 31
Thermal cycler: Perkin Elmer TC
Protocol:
Template: 10-20 ng
Primer: each 0.5 uM
dNTPs: each 200 uM
Taq Polymerase: 0.05 units/ul
Total vol: 20 ul
Buffer:
MgCl2: 2 mM
KCl: 50 mM
Tris-HCl: 20 mM
pH: 8.4.
FEATURES Location/Qualifiers
source 1..387
/organism="Zea mays"
/strain="DE811"
/db_xref="taxon:4577"
/clone_lib="maize leaf DNA"
/note="PCR products amplified from genomic DNA"
STS <1..>387
BASE COUNT 79 a 106 c 91 g 111 t
ORIGIN

Query Match 13.3%; Score 56; DB 11; Length 387;
Best Local Similarity 71.2%; Pred. No. 0.00022;
Matches 74; Conservative 0; Mismatches 30; Indels 0; Gaps 0;

Qy    160 ccgtaggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggc 219
          | | |||    || |||||  | || |||  | | || ||||||||||||||| |  |||
Db     87 CTGCAGGGACGCTGTGGCTCAGCGGCATCGTCGGCTCCATCGCCTACAACTGGGCCAGGC 146

Qy    220 ccaatatgaagcctagcgtcaagatcatccacgcaaggttgcat 263
          |    |||||| | ||||| |||||||||||||| ||||  | |
Db    147 CGGGCATGAAGACCAGCGTTAAGATCATCCACGCCAGGTCCCCT 190



RESULT 5
G70259
LOCUS G70259 368 bp DNA STS 08-JUN-2001
DEFINITION 695251131FB73-t j maize leaf DNA Zea mays STS genomic, sequence
tagged site.
ACCESSION G70259
VERSION G70259.1 GI:14331944
KEYWORDS STS.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 368)
AUTHORS Yang,Y.J., Guo,L., Ashlock,D.A., Wen,T.J. and Schnable,P.S.
TITLE 3' UTR sequences of maize genes
JOURNAL Unpublished
COMMENT Contact: Schnable, P.S.
Schnable laboratory
Iowa State University
G405 Agronomy Hall, Ames, IA 50011, USA
Tel: 515-294-0975
Fax: 515-294-2299
Email: schnable@iastate.edu
Primer A: CGAGGGAGTGGGTGGTC
Primer B: TGGTACCATCTGCATACACAAC
PCR Profile:
Denaturation: 94 degrees C for 30 seconds
Annealing: 60 degrees C for 45 seconds
Polymerization: 72 degrees C for 90 seconds
PCR cycles: 31
Thermal cycler: Perkin Elmer TC
Protocol:
Template: 10-20 ng
Primer: each 0.5 uM
dNTPs: each 200 uM
Taq Polymerase: 0.05 units/ul
Total vol: 20 ul
Buffer:
MgCl2: 2 mM
KCl: 50 mM
Tris-HCl: 20 mM
pH: 8.4.
FEATURES Location/Qualifiers
source 1..368
/organism="Zea mays"
/strain="DE811"
/db_xref="taxon:4577"
/clone_lib="maize leaf DNA"
/note="PCR products amplified from genomic DNA"
STS <1..>368
BASE COUNT 80 a 96 c 86 g 106 t
ORIGIN

Query Match 12.9%; Score 54.4; DB 11; Length 368;
Best Local Similarity 70.2%; Pred. No. 0.0006;
Matches 73; Conservative 0; Mismatches 31; Indels 0; Gaps 0;

Qy    160 ccgtaggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggc 219
          | | |||    || |||||  | || |||  | | || |||||||||||||||||  |||
Db     68 CTGCAGGGACGCTGTGGCTCAGCGGCATCGTCGGCTCCATCGCCTACAACTGGTCCAGGC 127

Qy    220 ccaatatgaagcctagcgtcaagatcatccacgcaaggttgcat 263
          |    |||||  | | ||| |||||||||||||| ||||  | |
Db    128 CGGGCATGAAAACCATCGTTAAGATCATCCACGCCAGGTCCCCT 171



RESULT 6
AF180335/C
LOCUS AF180335 27588 bp DNA PLN 28-SEP-1999
DEFINITION Glycine max malate dehydrogenase (Mdh1) gene, complete cds; nuclear
gene for mitochondrial product.
ACCESSION AF180335
VERSION AF180335.1 GI:5929963
KEYWORDS .
SOURCE soybean.
ORGANISM Glycine max
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
Rosidae; eurosids I; Fabales; Fabaceae; Papilionoideae; Phaseoleae;
Glycine.
REFERENCE 1 (bases 1 to 27588)
AUTHORS Imsande,J., Pittig,J., Palmer,R.G. and Gietl,C.
TITLE Independent spontaneously induced mitochondrial malate
dehydrogenase null mutants in soybean are the result of a deletion
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 27588)
AUTHORS Imsande,J., Pittig,J., Palmer,R.G. and Gietl,C.
TITLE Direct Submission
JOURNAL Submitted (24-AUG-1999) Plant Genetics, Agronomy, Iowa State
University, Ames, IA 50011, USA
FEATURES Location/Qualifiers
source 1..27588
/organism="Glycine max"
/cultivar="Williams"
/db_xref="taxon:3847"
repeat_region 981..1135
/rpt_type=dispersed
misc_feature 4113..4345
/note="similar to AMP binding proteins deposited in
GenBank Accession Number CAB10186 and encoded by GenBank
Accession Number Z72151"
repeat_region 5039..5235
/rpt_type=dispersed
repeat_region 5865..6065
/rpt_type=dispersed
mRNA join(<7254..7508,8542..8715,9167..9335,9616..9677,
9783..9904,10062..10179,10567..>10704)
/gene="Mdh1"
/product="malate dehydrogenase"
gene <7254..>10704
/gene="Mdh1"
CDS join(7254..7508,8542..8715,9167..9335,9616..9677,
9783..9904,10062..10179,10567..10704)
/gene="Mdh1"
/codon_start=1
/product="malate dehydrogenase"
/protein_id="AAD56659.1"
/db_xref="GI:5929964"
/translation="MMKPSMLRSLHSAATRGASHLFRRGYASEPVPERKVAVLGAAGG
IGQPLSLLMKLNPLVSSLSLYDIAGTPGVAADISHINTRSEVVGYQGDEELGKALEGA
DVVIIPAGVPRKPGMTRDDLFNINAGIVKTLCTAIAKYCPHALVNMISNPVNSTVPIA
AEVFKKAGTYDEKRLFGVTTLDVVRAKTFYAGKANVPVAGVNVPVVGGHAGITILPLF
SQATPKANLDDDVIKALTKRTQDGGTEVVEAKAGKGSATLSMAYAGALFADACLKGLN
GVPDVVECSFVQSTVTELPFFASKVRLGTVGVEEVLGLGHLSDFEQQGLESLKPELKS
SIEKGIKFANQ"
repeat_region 11875..12241
/note="similar to a barley retrotransposon"
/rpt_type=dispersed
repeat_region 15742..16109
/note="similar to a barley retrotransposon"
/rpt_type=dispersed
misc_feature complement(join(16851..17206,17865..18164))
/note="similar to antisense of soybean catalase encoded by
GenBank Accession Number Z12021"
repeat_region 18120..18135
/rpt_type=tandem
/rpt_unit=18120..18127
repeat_region 18141..18156
/rpt_type=tandem
/rpt_unit=18141..18148
misc_feature complement(26350..27063)
/gene="gag-pol"
/note="gag-pol protein with internal stop codon;
transposition of retrotransposon"
repeat_region complement(<26350..>27063)
/note="retrotransposon"
/rpt_type=dispersed
gene complement(26350..27063)
/gene="gag-pol"
BASE COUNT 8635 a 5055 c 5183 g 8665 t 50 others
ORIGIN

Query Match 12.4%; Score 52; DB 8; Length 27588;
Best Local Similarity 70.0%; Pred. No. 0.0017;
Matches 70; Conservative 0; Mismatches 30; Indels 0; Gaps 0;

Qy    164 aggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggcccaa 223
          ||| || || |||||  |||| ||     |||| ||||| ||||| ||||| || |||||
Db   6194 AGGGTGTCTGTGGCTCAGTGGTATTTCGCGTTCAATCGCTTACAATTGGTCTCGACCCAA 6135

Qy    224 tatgaagcctagcgtcaagatcatccacgcaaggttgcat 263
           |||||     | || | |||||| |||||||| |   ||
Db   6134 CATGAAAATCGGTGTTAGGATCATTCACGCAAGTTCCTAT 6095



RESULT 7 

AC069556 

LOCUS       AC069556     93672 bp    DNA             PLN       27-JUL-2000
DEFINITION  Genomic Sequence For Arabidopsis thaliana Clone T1G16 From
            Chromosome V, complete sequence.
ACCESSION   AC069556
VERSION     AC069556.4  GI:9502396
KEYWORDS    HTG.
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1  (bases 1 to 93672)
  AUTHORS   Rodriguez,M.A., Nascimento,L.U., de la Bastide,M., Preston,R.R.,
            Huang,E.N., See,L.H., Spiegel,L.A., Baker,J.P., Vil,M.D.,
            Shah,R.S., Bahret,A., King,L., Kirchoff,K.A., Miller,B.,
            Shekher,M., Toth,K., O'Shaughnessy,A., Dedhia,N.N. and
            McCombie,W.R.
  TITLE     Genomic Sequence For Arabidopsis thaliana Clone T1G16 From
            Chromosome V, Complete Sequence
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 93672)
  AUTHORS   McCombie,W.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (03-JUN-2000) Lita Annenberg Hazen Genome Sequencing
            Center, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring
            Harbor, NY 11724, USA
REFERENCE   3  (bases 1 to 93672)
  AUTHORS   McCombie,W.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-JUL-2000) Lita Annenberg Hazen Genome Sequencing
            Center, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring
            Harbor, NY 11724, USA
COMMENT     On Jul 27, 2000 this sequence version replaced gi:8698718.
            This sequence was finished as follows unless otherwise noted: all
            regions were either double-stranded or sequenced with an alternate
            chemistry or covered by high quality data (i.e., phred quality >=
            30); an attempt was made to resolve all sequencing problems, such
            as compressions and repeats; all regions were covered by at least
            one plasmid subclone or more than one M13 subclone; and the
            assembly was confirmed by restriction digest.

            Bases 1-18896 overlap with F15A18 (Accession # AC007478). The
            overlap is from 89035-107931 on F15A18. Bases 90723-93672 overlap
            with F14I23 (Accession # AC007399). The overlap is from 1-2949 on
            F14I23.
FEATURES             Location/Qualifiers
     source          1..93672
                     /organism="Arabidopsis thaliana"
                     /db_xref="taxon:3702"
                     /chromosome="V"
                     /clone="T1G16"
                     /clone_lib="TAMU"
     misc_feature    25520..25590
                     /note="This region is covered by one PCR product of high
                     quality data."
BASE COUNT    30853 a  15991 c  16091 g  30737 t
ORIGIN



Query Match 12.2%; Score 51.2; DB 8; Length 93672; 

Best Local Similarity 70.8%; Pred. No. 0.0024; 

Matches 68; Conservative 0; Mismatches 28; Indels 0; Gaps 0; 

Qy 163 taggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggccca 222 

I I I 1 I I I I 1 I I I I II MM III I I II II II II II I I I II II 
Db 51044 TAGGTTGCTTATGGCTAAGTGGTATCTCTGGTTCAATTGCTTATAATTGGTCTAAACCTG 51103

Qy 223 atatgaagcctagcgtcaagatcatccacgcaaggt 258 

I II II I II I I I I II I II I I I I I I I I II 
Db 51104 CCATGAAAACCAGTGTCAGAATCATCCACGCTAGGT 51139 



RESULT 8 

I66494/c 

LOCUS       I66494        7218 bp    DNA             PAT       28-DEC-1997
DEFINITION  Sequence 14 from patent US 5670367.
ACCESSION   I66494
VERSION     I66494.1  GI:2724471
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 7218)
  AUTHORS   Dorner,F., Scheiflinger,F. and Falkner,F.Gunter.
  TITLE     Recombinant fowlpox virus
  JOURNAL   Patent: US 5670367-A 14 23-SEP-1997;
FEATURES             Location/Qualifiers
     source          1..7218
                     /organism="unknown"
BASE COUNT     1944 a   1491 c   1486 g   1929 t    368 others
ORIGIN



Query Match 11.8%; Score 49.6; DB 6; Length 7218;
Best Local Similarity 6.4%; Pred. No. 0.0083;
Matches 10; Conservative 106; Mismatches 40; Indels 0; Gaps 0;

Qy      1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60
Db   1210 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1151

Qy     61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120
Db   1150 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1091

Qy    121 aatccatgaggaagtgggtcgtcgagcacaagctcc 156
          ::   : ::::::: :::  :  :| | ||||||||
Db   1090 RRRRRRRRRRRRRRRRRRRRRRRRATCGCAAGCTCC 1055
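The counts in this summary (Matches 10; Conservative 106; Mismatches 40) can be reproduced from the Qy and Db lines if one assumes that a database R (the IUPAC code for a purine, A or G) paired with a query a or g is tallied as "Conservative". That rule is an assumption inferred from the numbers, not something the report states, and the function name `tally` is hypothetical.

```python
# Assumption: "Conservative" = database ambiguity code R (purine) facing a query a/g.
PURINES = set("AG")


def tally(query, db):
    """Count exact matches, conservative (R vs purine) pairs, and mismatches."""
    counts = {"matches": 0, "conservative": 0, "mismatches": 0}
    for q, d in zip(query.upper(), db.upper()):
        if q == d:
            counts["matches"] += 1
        elif d == "R" and q in PURINES:
            counts["conservative"] += 1
        else:
            counts["mismatches"] += 1
    return counts


# Query positions 1-156, copied from the three Qy lines above.
query = (
    "gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa"
    "agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg"
    "aatccatgaggaagtgggtcgtcgagcacaagctcc"
)
# Database positions 1210-1055: 144 R symbols followed by ATCGCAAGCTCC.
db = "R" * 144 + "ATCGCAAGCTCC"

result = tally(query, db)
print(result)
```

With only 10 exact matches out of 156 aligned columns, this hit owes its score almost entirely to the run of ambiguity codes, which is consistent with the low Best Local Similarity printed for it.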





RESULT 9 

CBRG45011/C 

LOCUS       CBRG45011    26703 bp    DNA             INV       04-NOV-2000
DEFINITION  Caenorhabditis briggsae cosmid G45011, complete sequence.
ACCESSION   AC084652
VERSION     AC084652.1  GI:11095098
KEYWORDS    HTG.
SOURCE      Caenorhabditis briggsae.
  ORGANISM  Caenorhabditis briggsae
            Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida;
            Rhabditoidea; Rhabditidae; Peloderinae; Caenorhabditis.
REFERENCE   1  (bases 1 to 26703)
  AUTHORS   Washington University Genome Sequencing Center.
  TITLE     The C. briggsae Genome Sequencing Project
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 26703)
  AUTHORS   Waterston,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (04-NOV-2000) Department of Genetics, Washington
            University, 4444 Forest Park Avenue, St. Louis, Missouri 63108, USA
COMMENT     Submitted by:
            Genome Sequencing Center
            Department of Genetics, Washington University,
            St. Louis, MO 63110, USA
            e-mail: jspieth@watson.wustl.edu

            NOTICE: This sequence may not be the entire insert of this clone.
            It may be shorter because we only sequence overlapping sections
            once, or longer because we provide a small overlap between
            neighboring submissions.
FEATURES             Location/Qualifiers
     source          1..26703
                     /organism="Caenorhabditis briggsae"
                     /strain="GujArat G16"
                     /db_xref="taxon:6238"
                     /clone="G45011"
BASE COUNT     5831 a   6601 c   4766 g   9505 t

ORIGIN 



Query Match 10.8%; Score 45.6; DB 3; Length 26703;
Best Local Similarity 56.8%; Pred. No. 0.083;
Matches 84; Conservative 0; Mismatches 64; Indels 0; Gaps 0;



Qy 18 aaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggaggcc 77 

I I I I I I I I I I I I I I III I I I I I I I I I I II I I I I I I 
Db 7364 AAAAAGGAAGCTGACGAAAAGGCCAAGAAGGAAGCCGAGGCTAAGACTAAGAAGGAGGCT 7305

Qy 78 caggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaagtgg 137 

II II II I I I I I I I I I I I I II II II I I I I I I I I I 
Db 7304 GATGATAAGGCCAGGAAGGAGGCCGAGGCCAAGAAGGAGGCCGAGGCCAAGAAGGAGGCT 7245

Qy 138 gtcgtcgagcacaagctccgagccgtag 165 

I I I I I I MM II I I I I 

Db 7244 GACGACAAGGCCAAGAAGGAGGCTGTAG 7217



RESULT 10 

AC084066 

LOCUS       AC084066    235411 bp    DNA             HTG       12-OCT-2000
DEFINITION  Mus musculus clone RP23-321D1, *** SEQUENCING IN PROGRESS ***, 29
            unordered pieces.
ACCESSION   AC084066
VERSION     AC084066.1  GI:10799415
KEYWORDS    HTG; HTGS_PHASE1.
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 235411)
  AUTHORS   DOE Joint Genome Institute.
  TITLE     Sequencing of Mouse
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 235411)
  AUTHORS   DOE Joint Genome Institute.
  TITLE     Direct Submission
  JOURNAL   Submitted (12-OCT-2000) Production Sequencing Facility, DOE Joint
            Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
COMMENT     Genome Center
            Center: Joint Genome Institute
            Center Code: JGI
            Web site: http://www.jgi.doe.gov
            Project Information
            Center Project Name: 2351294
            Center clone name: RPCI-23_321D1
            Summary Statistics
            Consensus quality: 214207 bases at least Q40
            Consensus quality: 223053 bases at least Q30
            Consensus quality: 225208 bases at least Q20
            Estimated insert size: 200000; pulse field gel estimation
            Estimated insert size: 232611; sum-of-contigs estimation
            Quality coverage: 11.56 in Q20 bases; pulse field gel estimation
            Quality coverage: 9.94 in Q20 bases; sum-of-contigs estimation.

            NOTE: This is a 'working draft' sequence. It currently
            consists of 29 contigs. The true order of the pieces
            is not known and their order in this sequence record is
            arbitrary. Gaps between the contigs are represented as
            runs of N, but the exact sizes of the gaps are unknown.
            This record will be updated with the finished sequence
            as soon as it is available and the accession number will
            be preserved.

1 1210: contig of 1210 bp in length 

1211 1310: gap of unknown length 

1311 3240: contig of 1930 bp in length 

3241 3340: gap of unknown length 

3341 4504: contig of 1164 bp in length 

4505 4604: gap of unknown length 

4605 6172: contig of 1568 bp in length 

6173 6272: gap of unknown length 

6273 7354: contig of 1082 bp in length 

7355 7454: gap of unknown length 

7455 8625: contig of 1171 bp in length 

8626 8725: gap of unknown length 

8726 10114: contig of 1389 bp in length 

10115 10214: gap of unknown length 

10215 12091: contig of 1877 bp in length 

12092 12191: gap of unknown length 

12192 14113: contig of 1922 bp in length 

14114 14213: gap of unknown length 

14214 15286: contig of 1073 bp in length 

15287 15386: gap of unknown length 

15387 17839: contig of 2453 bp in length 

17840 17939: gap of unknown length 

17940 20266: contig of 2327 bp in length 

20267 20366: gap of unknown length 

20367 22001: contig of 1635 bp in length 

22002 22101: gap of unknown length 

22102 24307: contig of 2206 bp in length 

24308 24407: gap of unknown length 

24408 26405: contig of 1998 bp in length 

26406 26505: gap of unknown length 

26506 29691: contig of 3186 bp in length 

29692 29791: gap of unknown length 

29792 34705: contig of 4914 bp in length 

34706 34805: gap of unknown length 

34806 39749: contig of 4944 bp in length 

39750 39849: gap of unknown length 

39850 45296: contig of 5447 bp in length 

45297 45396: gap of unknown length 

45397 51476: contig of 6080 bp in length 

51477 51576: gap of unknown length 

51577 59008: contig of 7432 bp in length 

59009 59108: gap of unknown length 

59109 66218: contig of 7110 bp in length 

66219 66318: gap of unknown length 

66319 76778: contig of 10460 bp in length 

76779 76878: gap of unknown length 

76879 98098: contig of 21220 bp in length 

98099 98198: gap of unknown length 

98199 113987: contig of 15789 bp in length 



113988 114087: gap of unknown length
114088 132115: contig of 18028 bp in length
132116 132215: gap of unknown length
132216 160184: contig of 27969 bp in length
160185 160284: gap of unknown length
160285 186698: contig of 26414 bp in length
186699 186798: gap of unknown length
186799 235411: contig of 48613 bp in length.
FEATURES             Location/Qualifiers
     source          1..235411
                     /organism="Mus musculus"
                     /db_xref="taxon:10090"
                     /clone="RP23-321D1"
                     /clone_lib="RPCI mouse BAC library 23"
BASE COUNT    62132 a  54445 c  54225 g  61806 t   2803 others
ORIGIN



Query Match 10.3%; Score 43.4; DB 2; Length 235411; 

Best Local Similarity 55.7%; Pred. No. 0.26; 

Matches 83; Conservative 0; Mismatches 66; Indels 0; Gaps 0;

Qy 2 aaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaa 61 

I I I I I II I I I I I I I I I I I I I II I I I I I I I II I I I I II 
Db 180576 AGAAGAAGAAGAAGAAAAAGAAGAAGAAGAAGAAGAAGAGGAGGAGGAGGAGGAGGTGAA 180635

Qy 62 gctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcga 121 

I I I I I I I I I I I I I I I I I I I I I I III II I I I I I 

Db 180636 GGAGAAGGAGAAGGAGAAGGAGAAAAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAA 180695 

Qy 122 atccatgaggaagtgggtcgtcgagcaca 150 

I II I I I I I I I I I I I 

Db 180696 GAAGAAGAAGAAGAGAGAGAGAGAGAAGA 180724 



RESULT 11 

AC067840 

LOCUS       AC067840    187278 bp    DNA             HTG       28-MAY-2000
DEFINITION  Homo sapiens chromosome 14 clone RP11-633N4 map 14, WORKING DRAFT
            SEQUENCE, 13 unordered pieces.
ACCESSION   AC067840
VERSION     AC067840.2  GI:8099886
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 187278)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C. and Lander,E.
  TITLE     Homo sapiens chromosome 14, clone RP11-633N4
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 187278)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C., Lander,E., Abraham,H., Allen,N.,
            Anderson,S., Baldwin,J., Barna,N., Bastien,V., Beda,F.,
            Boguslavkiy,L., Boukhgalter,B., Brown,A., Burkett,G.,
            Campopiano,A., Castle,A., Choepel,Y., Colangelo,M., Collins,S.,
            Collymore,A., Cooke,P., DeArellano,K., Dewar,K., Diaz,J.S.,
            Dodge,S., Domino,M., Doyle,M., Ferreira,P., FitzHugh,W., Gage,D.,
            Galagan,J., Gardyna,S., Ginde,S., Goyette,M., Graham,L.,
            Grand-Pierre,N., Grant,G., Hagos,B., Heaford,A., Horton,L.,
            Howland,J.C., Iliev,I., Johnson,R., Jones,C., Kann,L., Karatas,A.,
            Klein,J., LaRocque,K., Lamazares,R., Landers,T., Lehoczky,J.,
            Levine,R., Lieu,C., Liu,G., Locke,K., Macdonald,P., Marquis,N.,
            McCarthy,M., McEwan,P., McGurk,A., McKernan,K., McPheeters,R.,
            Meldrim,J., Meneus,L., Mihova,T., Miranda,C., Mlenga,V., Morrow,J.,
            Murphy,T., Naylor,J., Norman,C.H., O'Connor,T., O'Donnell,P.,
            O'Neil,D., Olivar,T.M., Oliver,J., Peterson,K., Pierre,N.,
            Pisani,C., Pollara,V., Raymond,C., Riley,R., Rogov,P., Rothman,D.,
            Roy,A., Santos,R., Schauer,S., Severy,P., Spencer,B.,
            Stange-Thomann,N., Stojanovic,N., Subramanian,A., Talamas,J.,
            Tesfaye,S., Theodore,J., Tirrell,A., Travers,M., Trigilio,J.,
            Vassiliev,H., Viel,R., Vo,A., Wilson,B., Wu,X., Wyman,D., Ye,W.J.,
            Young,G., Zainoun,J., Zimmer,A. and Zody,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-APR-2000) Whitehead Institute/MIT Center for Genome
            Research, 320 Charles Street, Cambridge, MA 02141, USA
COMMENT     On May 28, 2000 this sequence version replaced gi:7651891.
            All repeats were identified using RepeatMasker:
            Smit, A.F.A. & Green, P. (1996-1997)
            http://ftp.genome.Washington.edu/RM/RepeatMasker.html
            Genome Center
            Center: Whitehead Institute/ MIT Center for Genome Research
            Center code: WIBR
            Web site: http://www-seq.wi.mit.edu
            Contact: sequence_submissions@genome.wi.mit.edu
            Project Information
            Center project name: L9398
            Center clone name: 633_N_4
            Summary Statistics
            Sequencing vector: M13; M77815; 100% of reads
            Chemistry: Dye-terminator Big Dye; 100% of reads
            Assembly program: Phrap; version 0.960731
            Consensus quality: 180520 bases at least Q40
            Consensus quality: 183792 bases at least Q30
            Consensus quality: 185077 bases at least Q20
            Insert size: 187000; agarose-fp
            Insert size: 186078; sum-of-contigs
            Quality coverage: 5.0 in Q20 bases; agarose-fp
            Quality coverage: 5.0 in Q20 bases; sum-of-contigs

            * NOTE: This is a 'working draft' sequence. It currently
            * consists of 13 contigs. The true order of the pieces
            * is not known and their order in this sequence record is
            * arbitrary. Gaps between the contigs are represented as
            * runs of N, but the exact sizes of the gaps are unknown.
            * This record will be updated with the finished sequence
            * as soon as it is available and the accession number will
            * be preserved.
            *        1   2734: contig of 2734 bp in length
            *     2735   2834: gap of 100 bp
            *     2835   7097: contig of 4263 bp in length
            *     7098   7197: gap of 100 bp
            *     7198  10127: contig of 2930 bp in length
            *    10128  10227: gap of 100 bp
            *    10228  15117: contig of 4890 bp in length
            *    15118  15217: gap of 100 bp
            *    15218  24716: contig of 9499 bp in length
            *    24717  24816: gap of 100 bp
            *    24817  32540: contig of 7724 bp in length
            *    32541  32640: gap of 100 bp
            *    32641  47120: contig of 14480 bp in length
            *    47121  47220: gap of 100 bp
            *    47221  64489: contig of 17269 bp in length
            *    64490  64589: gap of 100 bp
            *    64590  82156: contig of 17567 bp in length
            *    82157  82256: gap of 100 bp
            *    82257 104149: contig of 21893 bp in length
            *   104150 104249: gap of 100 bp
            *   104250 126861: contig of 22612 bp in length
            *   126862 126961: gap of 100 bp
            *   126962 152105: contig of 25144 bp in length
            *   152106 152205: gap of 100 bp
            *   152206 187278: contig of 35073 bp in length.
FEATURES             Location/Qualifiers
     source          1..187278
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="14"
                     /map="14"
                     /clone="RP11-633N4"
                     /clone_lib="RPCI-11 Human Male BAC"
     misc_feature    1..2734
                     /note="assembly_fragment
                     clone_end:T7
                     vector_side:left"
     misc_feature    2835..7097
                     /note="assembly_fragment"
     misc_feature    7198..10127
                     /note="assembly_fragment
                     clone_end:SP6
                     vector_side:left"
     misc_feature    10228..15117
                     /note="assembly_fragment"
     misc_feature    15218..24716
                     /note="assembly_fragment"
     misc_feature    24817..32540
                     /note="assembly_fragment"
     misc_feature    32641..47120
                     /note="assembly_fragment"
     misc_feature    47221..64489
                     /note="assembly_fragment"
     misc_feature    64590..82156
                     /note="assembly_fragment"
     misc_feature    82257..104149
                     /note="assembly_fragment"
     misc_feature    104250..126861
                     /note="assembly_fragment"
     misc_feature    126962..152105
                     /note="assembly_fragment"
     misc_feature    152206..187278
                     /note="assembly_fragment"
BASE COUNT    56234 a  35308 c  35819 g  58686 t   1231 others
ORIGIN 



Query Match 10.2%; Score 42.8; DB 2; Length 187278; 

Best Local Similarity 57.5%; Pred. No. 0.38; 

Matches 77; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 



Qy 2 aaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaa 61
1 1 1 1 1 1 1 1 1 1 I 1 1 M II 1 1 1 1 1 1 1 1 1 1 lllll 1 1 1 II 
Db 47386 AGAAGAAAGAAGAGGAGAAGAAGGAGAAGGAGAAGAGAAGGAGGAGGAGGAGGAGGAGAA 47445

Qy 62 gctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcga 121
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 Ml 1 1 1 1 1 1 
Db 47446 GGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGAAGAA 47505

Qy 122 atccatgaggaagt 135
1 II III 1 
Db 47506 GAAGAAGAAGAATT 47519





RESULT 12 
AC025579/C 
LOCUS       AC025579    221104 bp    DNA             HTG       17-JUL-2001
DEFINITION  Mus musculus clone RP23-286A4, WORKING DRAFT SEQUENCE, 4 unordered
            pieces.
ACCESSION   AC025579
VERSION     AC025579.23  GI:14787152
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT; HTGS_FULLTOP; HTGS_ACTIVEFIN.
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 221104)
  AUTHORS   Metzker,M.L., Lewis,L.R., Hume,J., Edwards,C., Harris,C.,
            Dederich,D., Thomas,S., Okwuonu,G., Carlock,C., Garner,T.,
            Addison,S., Pace,A., Williams,G., Bonnin,D., Brooks,A., Brown,J.,
            Buhay,C., Bunac,C., Burkett,C., Chacko,J., Chen,G., Chen,Z.,
            Cox,C., Davis,C., Delgado,O., Ding,Y., Dugan-Rocha,S.,
            Fernandez,C., Ferraguto,D., Forcum-Tansey,J., Gill,R.,
            Gorrell,J.H., Gunaratne,P., Haller,G., Hernandez,J., Hogues,M.,
            Hosak,H., Hou,X., Huber,J., Jackson,L., Jia,Y., Kelly,J., Kelly,S.,
            Kovar,C., Liu,J., Liu,W., Loulseged,H., Lozado,R.J., Martin,R.,
            Massey,E., McLeod,M.P., Mei,G., Moore,S., Morgan,M., Morris,S.,
            Neal,D., Nelson,A., Nguyen,R., Nguyen,N., Oguh,M., Parish,B.,
            Perez,L., Reiter,D., Say,J., Shen,H., Vasquez,L., Watlington,S.,
            Williamson,A., Wrensford,G., Zhou,X., Bouck,J., Hodgson,A.,
            Muzny,D.M., Rives,M., Scherer,S., Sodergren,E., Weinstock,G.,
            Worley,K. and Gibbs,R.
  TITLE     Direct Submission
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 221104)
  AUTHORS   Worley,K.C.
  TITLE     Direct Submission
  JOURNAL   Submitted (11-MAR-2000) Human Genome Sequencing Center, Department
            of Molecular and Human Genetics, Baylor College of Medicine, One
            Baylor Plaza, Houston, TX 77030, USA
COMMENT     On Jul 17, 2001 this sequence version replaced gi:14547759.
            Genome Center
            Center: Baylor College of Medicine
            Center code: BCM
            Web site: http://www.hgsc.bcm.tmc.edu/
            Contact: hgsc-help@bcm.tmc.edu
            Project Information
            Center project name: MABH
            Center clone name: RP23-286A4
            Summary Statistics
            Sequencing vector: M13; L08821
            Chemistry: Dye-primer Bodipy: 53% of reads
            Chemistry: Dye-terminator Big Dye: 47% of reads
            Assembly program: Phrap; version 0.990329
            Consensus quality: 222457 bases at least Q40
            Consensus quality: 223927 bases at least Q30
            Consensus quality: 224923 bases at least Q20
            Estimated insert size: 219940; sum-of-contigs estimation
            Quality coverage: 0x in Q20 bases; agarose-fp estimation
            Quality coverage: 7.9x in Q20 bases; sum-of-contigs estimation

            NOTE: Estimated insert size may differ from sequence length
            (see http://www.hgsc.bcm.tmc.edu/docs/Genbank_draft_data.html)
            NOTE: This is a 'working draft' sequence. It currently
            consists of 4 contigs. The true order of the pieces
            is not known and their order in this sequence record is
            arbitrary. Gaps between the contigs are represented as
            runs of N, but the exact sizes of the gaps are unknown.
            This record will be updated with the finished sequence
            as soon as it is available and the accession number will
            be preserved.
                 1  67274: contig of 67274 bp in length
             67275  67374: gap of unknown length
             67375 132337: contig of 64963 bp in length
            132338 132437: gap of unknown length
            132438 188824: contig of 56387 bp in length
            188825 188924: gap of unknown length
            188925 221104: contig of 32180 bp in length.
FEATURES             Location/Qualifiers
     source          1..221104
                     /organism="Mus musculus"
                     /db_xref="taxon:10090"
                     /clone="RP23-286A4"
BASE COUNT    57615 a  55079 c  54460 g  53650 t    300 others
ORIGIN
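The coordinate table in a working-draft record can be regenerated from the contig lengths alone. The sketch below assumes each gap is a fixed placeholder run of 100 N, which is what the coordinates in this record imply even though the true gap sizes are stated to be unknown. The function name `layout` and the `gap` parameter are hypothetical.

```python
def layout(contig_lengths, gap=100):
    """Return (start, end, kind) rows for contigs separated by fixed-size gaps.

    Coordinates are 1-based and inclusive, as in the GenBank COMMENT table.
    Assumption: every gap is a placeholder run of `gap` N characters.
    """
    rows = []
    start = 1
    last = len(contig_lengths) - 1
    for i, length in enumerate(contig_lengths):
        end = start + length - 1
        rows.append((start, end, "contig"))
        start = end + 1
        if i < last:
            rows.append((start, start + gap - 1, "gap"))
            start += gap
    return rows


# Contig lengths listed for AC025579 in the record above.
rows = layout([67274, 64963, 56387, 32180])
for start, end, kind in rows:
    print(f"{start:>7} {end:>7}: {kind}")
```

The end coordinate of the final row should equal the record length on the LOCUS line, which gives a quick consistency check on a reconstructed table.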



Query Match 10.1%; Score 42.4; DB 2; Length 221104; 

Best Local Similarity 55.4%; Pred. No. 0.47; 

Matches 82; Conservative 0; Mismatches 66; Indels 0; Gaps 0; 

Qy 5 aaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagct 64 

I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I III 
Db 131331 AAAAAAAAAAGAGGAGGAAGAAGAGGAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGA 131272 

Qy 65 gatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatc 124 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 



Db 131271 GAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGAAGAAGAAGAAGAA 131212 



Qy 125 catgaggaagtgggtcgtcgagcacaag 152 

I I I M I I I Mill! 
Db 131211 GAAGAAGAAGAAGAAGAAGAAGAAGAAG 131184 



RESULT 13 
AC083834/C 
LOCUS       AC083834    241714 bp    DNA             HTG       27-JUN-2001
DEFINITION  Mus musculus chromosome 11 clone RP23-51J10, WORKING DRAFT
            SEQUENCE, 5 unordered pieces.
ACCESSION   AC083834
VERSION     AC083834.20  GI:14547775
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT; HTGS_FULLTOP.
SOURCE      house mouse.
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 241714)
  AUTHORS   Metzker,M.L., Lewis,L.R., Hume,J., Edwards,C., Harris,C.,
            Dederich,D., Thomas,S., Okwuonu,G., Carlock,C., Garner,T.,
            Addison,S., Pace,A., Williams,G., Bonnin,D., Brooks,A., Brown,J.,
            Buhay,C., Bunac,C., Burkett,C., Chacko,J., Chen,G., Chen,Z.,
            Cox,C., Davis,C., Delgado,O., Ding,Y., Dugan-Rocha,S.,
            Fernandez,C., Ferraguto,D., Forcum-Tansey,J., Gill,R.,
            Gorrell,J.H., Gunaratne,P., Haller,G., Hernandez,J., Hogues,M.,
            Hosak,H., Hou,X., Huber,J., Jackson,L., Jia,Y., Kelly,J., Kelly,S.,
            Kovar,C., Liu,J., Liu,W., Loulseged,H., Lozado,R.J., Martin,R.,
            Massey,E., McLeod,M.P., Mei,G., Moore,S., Morgan,M., Morris,S.,
            Neal,D., Nelson,A., Nguyen,R., Nguyen,N., Oguh,M., Parish,B.,
            Perez,L., Reiter,D., Say,J., Shen,H., Vasquez,L., Watlington,S.,
            Williamson,A., Wrensford,G., Zhou,X., Bouck,J., Hodgson,A.,
            Muzny,D.M., Rives,M., Scherer,S., Sodergren,E., Weinstock,G.,
            Worley,K. and Gibbs,R.
  TITLE     Direct Submission
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 241714)
  AUTHORS   Worley,K.C.
  TITLE     Direct Submission
  JOURNAL   Submitted (03-OCT-2000) Human Genome Sequencing Center, Department
            of Molecular and Human Genetics, Baylor College of Medicine, One
            Baylor Plaza, Houston, TX 77030, USA
COMMENT     On Jun 25, 2001 this sequence version replaced gi:13096022.
            Genome Center
            Center: Baylor College of Medicine
            Center code: BCM
            Web site: http://www.hgsc.bcm.tmc.edu/
            Contact: hgsc-help@bcm.tmc.edu
            Project Information
            Center project name: MAPR
            Center clone name: RP23-51J10
            Summary Statistics
            Sequencing vector: M13; L08821
            Chemistry: Dye-primer Bodipy: 55% of reads
            Chemistry: Dye-terminator Big Dye: 45% of reads
            Assembly program: Phrap; version 0.990329
            Consensus quality: 239599 bases at least Q40
            Consensus quality: 241588 bases at least Q30
            Consensus quality: 243225 bases at least Q20
            Estimated insert size: 243199; sum-of-contigs estimation
            Quality coverage: 0x in Q20 bases; agarose-fp estimation
            Quality coverage: 8.2x in Q20 bases; sum-of-contigs estimation

            NOTE: Estimated insert size may differ from sequence length
            (see http://www.hgsc.bcm.tmc.edu/docs/Genbank_draft_data.html)
            NOTE: This is a 'working draft' sequence. It currently
            consists of 5 contigs. The true order of the pieces
            is not known and their order in this sequence record is
            arbitrary. Gaps between the contigs are represented as
            runs of N, but the exact sizes of the gaps are unknown.
            This record will be updated with the finished sequence
            as soon as it is available and the accession number will
            be preserved.
                 1  59312: contig of 59312 bp in length
             59313  59412: gap of unknown length
             59413 113658: contig of 54246 bp in length
            113659 113758: gap of unknown length
            113759 165679: contig of 51921 bp in length
            165680 165779: gap of unknown length
            165780 206158: contig of 40379 bp in length
            206159 206258: gap of unknown length
            206259 241714: contig of 35456 bp in length
FEATURES             Location/Qualifiers
     source          1..241714
                     /organism="Mus musculus"
                     /db_xref="taxon:10090"
                     /chromosome="11"
                     /clone="RP23-51J10"
BASE COUNT    65862 a  55533 c  55480 g  64438 t    401 others
ORIGIN



Query Match 10.1%; Score 42.4; DB 2; Length 241714; 

Best Local Similarity 55.4%; Pred. No. 0.47; 

Matches 82; Conservative 0; Mismatches 66; Indels 0; Gaps 0; 

Qy 5 aaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagct 64 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I Ml 

Db 217714 AAAAAAAAAAGAGGAGGAAGAAGAGGAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGA 217655

Qy 65 gatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatc 124 

II M I III I I I I II I I I I I I I I I I I I II I I I I I 

Db 217654 GAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGAAGAAGAAGAAGAA 217595

Qy 125 catgaggaagtgggtcgtcgagcacaag 152 

I I I I I I I I I I I I I I 

Db 217594 GAAGAAGAAGAAGAAGAAGAAGAAGAAG 217567



RESULT 14 
AC092197 

LOCUS AC092197 143977 bp DNA HTG 27-JUL-2001 

DEFINITION  Homo sapiens chromosome 3q clone RP11-67H6, WORKING DRAFT SEQUENCE,
            11 unordered pieces.
ACCESSION   AC092197
VERSION     AC092197.5  GI:15011619
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT; HTGS_FULLTOP.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 143977)
  AUTHORS   Muzny,D.M., Adams,C., Adio-Oduola,B., Ali-osman,F.R., Allen,C.,
            Alsbrooks,S.L., Amaratunge,H.C., Are,J.R., Banks,T., Barbaria,J.,
            Benton,J., Bimage,K., Blankenburg,K., Bonnin,D., Bouck,J.,
            Bowie,S., Brieva,M., Brown,E., Brown,M., Bryant,N.P., Buhay,C.,
            Burch,P., Burkett,C., Burrell,K.L., Byrd,N.C., Carron,T.F.,
            Carter,M., Cavazos,S.R., Chacko,J., Chavez,D., Chen,G., Chen,R.,
            Chen,Z., Chowdhry,I., Christopoulos,C., Cleveland,C.D., Cox,C.,
            Coyle,M.D., Dathorne,S.R., David,R., Davila,M.L., Davis,C.,
            Davy-Carroll,L., Dederich,D.A., Delaney,K.R., Delgado,O.,
            Denn,A.L., Ding,Y., Dinh,H.H., Douthwaite,K.J., Draper,H.,
            Dugan-Rocha,S., Durbin,K.J., Earnhart,C., Edgar,D., Edwards,C.C.,
            Elhaj,C., Escotto,M., Falls,T., Ferraguto,D., Flagg,N., Ford,J.,
            Foster,P., Frantz,P., Gabisi,A., Gao,J., Garcia,A., Garner,T.,
            Garza,N., Gill,R., Gorrell,J.H., Guevara,W., Gunaratne,P., Hale,S.,
            Hamilton,K., Harris,C., Harris,K., Hart,M., Havlak,P., Hawes,A.,
            Hernandez,J., Hernandez,O., Hodgson,A., Hogues,M., Holloway,C.,
            Hollins,B., Homsi,F., Howard,S., Huber,J., Hulyk,S., Hume,J.,
            Jackson,L.E., Jacobson,B., Jia,Y., Johnson,R., Jolivet,S.,
            Joudah,S., Karlsson,E., Kelly,S., Khan,U., King,L., Korvah,J.,
            Kovar,C., Kratovic,J., Kureshi,A., Landry,N., Leal,B., Lewis,L.C.,
            Lewis,L., Li,J., Li,Z., Lichtarge,O., Lieu,C., Liu,J., Liu,W.,
            Loulseged,H., Lozado,R.J., Lu,X., Lucier,A., Lucier,R., Luna,R.,
            Ma,J., Maheshwari,M., Mapua,P., Martin,R., Martindale,A.,
            Martinez,E., Massey,E., Mawhiney,E., McLeod,M.P., Meador,M.,
            Mei,G., Metzker,M., Miner,G., Miner,Z., Mitchell,T., Mohabbat,K.,
            Morgan,M., Morris,S., Moser,M., Neal,D., Newtson,J., Newtson,N.,
            Nguyen,A., Nguyen,N., Nguyen,N., Nickerson,E., Nwokenkwo,S.,
            Oguh,M., Okwuonu,G., Oragunye,N., Oviedo,R., Pace,A., Payton,B.,
            Peery,J., Perez,L., Peters,L., Pickens,R., Primus,E., Pu,L.L.,
            Quiles,M., Ren,Y., Rives,M., Rojas,A., Rojubokan,I., Rolfe,M.,
            Ruiz,S., Savery,G., Scherer,S., Scott,G., Shen,H., Shooshtari,N.,
            Sisson,I., Sodergren,E., Sonaike,T., Sparks,A., Stanley,H.,
            Stone,H., Sutton,A., Svatek,A., Tabor,P., Tamerisa,A., Tamerisa,K.,
            Tang,H., Tansey,J., Taylor,C., Taylor,T., Telfrod,B., Thomas,N.,
            Thomas,S., Usmani,K., Vasquez,L., Vera,V., Villalon,D., Vinson,R.,
            Wall,R., Wang,S., Ward-Moore,S., Warren,R., Washington,C.,
            Watlington,S., Williams,G., Williamson,A., Wleczyk,R., Wooden,S.,
            Worley,K., Wu,C., Wu,Y., Wu,Y.F., Zhou,J., Zorrilla,S., Nelson,D.,
            Weinstock,G. and Gibbs,R.
  TITLE     Direct Submission
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 143977)
  AUTHORS   Worley,K.C.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-JUN-2001) Human Genome Sequencing Center, Department
            of Molecular and Human Genetics, Baylor College of Medicine, One
            Baylor Plaza, Houston, TX 77030, USA
COMMENT     On Jul 25, 2001 this sequence version replaced gi:14971137.
            Genome Center
            Center: Baylor College of Medicine
            Center code: BCM
            Web site: http://www.hgsc.bcm.tmc.edu/
            Contact: hgsc-help@bcm.tmc.edu
            Project Information
            Center project name: HCHC
            Center clone name: RP11-67H6
            Summary Statistics
            Sequencing vector: Plasmid; M77789
            Chemistry: Dye-terminator Big Dye: 100% of reads
            Assembly program: Phrap; version 0.990329
            Consensus quality: 144495 bases at least Q40
            Consensus quality: 146471 bases at least Q30
            Consensus quality: 148226 bases at least Q20
            Estimated insert size: 144773; sum-of-contigs estimation
            Quality coverage: 0x in Q20 bases; agarose-fp estimation
            Quality coverage: 8.8x in Q20 bases; sum-of-contigs estimation

            NOTE: Estimated insert size may differ from sequence length
            (see http://www.hgsc.bcm.tmc.edu/docs/Genbank_draft_data.html)
            NOTE: This is a 'working draft' sequence. It currently
            consists of 11 contigs. The true order of the pieces
            is not known and their order in this sequence record is
            arbitrary. Gaps between the contigs are represented as
            runs of N, but the exact sizes of the gaps are unknown.
            This record will be updated with the finished sequence
            as soon as it is available and the accession number will
            be preserved.
                 1  24556: contig of 24556 bp in length
             24557  24656: gap of unknown length
             24657  48219: contig of 23563 bp in length
             48220  48319: gap of unknown length
             48320  69392: contig of 21073 bp in length
             69393  69492: gap of unknown length
             69493  88787: contig of 19295 bp in length
             88788  88887: gap of unknown length
             88888  99892: contig of 11005 bp in length
             99893  99992: gap of unknown length
             99993 111041: contig of 11049 bp in length
            111042 111141: gap of unknown length
            111142 118551: contig of 7410 bp in length
            118552 118651: gap of unknown length
            118652 128074: contig of 9423 bp in length
            128075 128174: gap of unknown length
            128175 135844: contig of 7670 bp in length
            135845 135944: gap of unknown length
            135945 141503: contig of 5559 bp in length
            141504 141603: gap of unknown length
            141604 143977: contig of 2374 bp in length.
FEATURES             Location/Qualifiers
     source          1..143977
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="3q"
                     /clone="RP11-67H6"
BASE COUNT    43520 a  27330 c  27935 g  44191 t   1001 others



ORIGIN 



Query Match 10.0%; Score 42; DB 2; Length 143977; 

Best Local Similarity 56.5%; Pred. No. 0.63; 

Matches 78; Conservative 0; Mismatches 60; Indels 0; Gaps 0; 

Qy 1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60 

II I II I I I I I I I I I I I I I I I I II I I I I I I II 
Db 24385 GGAGGAAAGAAAAAGAAGAAAAAAGAAGAGAAGGGATGAAGAGAAGGGAGGAGAGGAGAA 24444

Qy 61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I 
Db 24445 AGAGGATGGAGGAGAGAGAGGGGAGAGAAAAGAGAAAGGGGGAAGGAGGGAGAGGGGACA 24504

Qy 121 aatccatgaggaagtggg 138

II I II I I I I I I 
Db 24505 AAGGGAAGAGGGAGAGAG 24522
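
The summary lines printed with each hit are consistent with two simple ratios: Query Match appears to be Score divided by the Perfect score (421 for this query), and Best Local Similarity appears to be Matches divided by the number of aligned columns. The Python sketch below checks that reading against the figures reported for this hit. The function names are hypothetical, and counting indels in the denominator is an assumption, since every hit listed here reports 0 indels.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, mismatches: int, indels: int = 0) -> float:
    """Matches as a percentage of aligned columns, to one decimal.

    Including indels in the denominator is an assumption; the hits in this
    report all have Indels 0, so they cannot confirm or refute it.
    """
    aligned_columns = matches + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


# Figures taken from the hit above: Score 42, Perfect score 421,
# Matches 78, Mismatches 60.
print(query_match_percent(42, 421))   # 10.0
print(best_local_similarity(78, 60))  # 56.5
```

Both values agree with the printed "Query Match 10.0%" and "Best Local Similarity 56.5%".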



RESULT 15 
AC007426/C 
LOCUS       AC007426 84199 bp DNA HTG 24-MAR-2000
DEFINITION  Homo sapiens clone RP5-241A3, WORKING DRAFT SEQUENCE, 12 unordered
            pieces.
ACCESSION   AC007426
VERSION     AC007426.2 GI:7321581
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1 (bases 1 to 84199)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C. and Lander,E.
  TITLE     Homo sapiens, clone RP5-241A3
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 84199)
  AUTHORS   Birren,B., Linton,L., Nusbaum,C., Lander,E., Allen,N., Anderson,M.,
            Baker,J., Baldwin,J., Barna,N., Beckerly,R., Benn,J., Brown,A.,
            Castle,A., Cerny,J., Colangelo,M., Collins,S., Collymore,A.,
            Cooke,P., DeArellano,K., Depayre,E., Devon,K., Dewar,K.,
            Donelan,L., Doyle,M., Ferreira,P., FitzHugh,W., Forrest,C.,
            Funke,R., Gage,D., Galagan,J., Gardyna,S., Gilbert,D., Grant,G.,
            Hagos,B., Heaford,A., Horton,L., Howland,J.C., Jones,C., Kann,L.,
            Karatas,A., Lehoczky,J., Lieu,C., Locke,K., Macdonald,P.,
            Marquis,N., McEwan,P., McGurk,A., McKernan,K., McLaughlin,J.,
            Meldrim,J., Molla,M., Morris,W., Morrow,J., Mychaleckyj,J.,
            Naylor,J., Niloff,M., O'Connor,T., O'Donnell,P., Pavlin,B.,
            Peterson,K., Pollara,V., Riley,R., Roberts,D., Roy,A., Severy,P.,
            Stange-Thomann,N., Stojanovic,N., Stone,C., Subramanian,A.,
            Tesfaye,S., Torruella-Miller,I., Vassiliev,H., Vo,A., Wagner,A.,
            Wheeler,J., Wu,X., Wyman,D., Ye,W.J. and Zody,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (29-APR-1999) Whitehead Institute/MIT Center for Genome
            Research, 320 Charles Street, Cambridge, MA 02141, USA
COMMENT     On Mar 24, 2000 this sequence version replaced gi:4713974.
            All repeats were identified using RepeatMasker:
            Smit, A.F.A. & Green, P. (1996-1997)
            http://ftp.genome.washington.edu/RM/RepeatMasker.html
Genome Center 

Center: Whitehead Institute/MIT Center for Genome Research

Center code: WIBR 

Web site: http://www-seq.wi.mit.edu 

Contact: sequence_submissions@genome.wi.mit.edu
Project Information 

Center project name: L578 

Center clone name: 241_A_3 
Summary Statistics 

Sequencing vector: M13; M77815; 96% of reads 

Sequencing vector: Plasmid; n/a; %-0.f%% of reads 3.5036496350365

Chemistry: Dye-primer-amersham; 96% of reads

Chemistry: Dye-terminator Big Dye; 4% of reads 

Assembly program: Phrap; version 0.960731 

Consensus quality: 76051 bases at least Q40 

Consensus quality: 79580 bases at least Q30 

Consensus quality: 81167 bases at least Q20 

Insert size: 83000; agarose-fp 

Insert size: 83099; sum-of-contigs 

Quality coverage: 4.1. 

* NOTE: This is a 'working draft' sequence. It currently

* consists of 12 contigs. The true order of the pieces 

* is not known and their order in this sequence record is 

* arbitrary. Gaps between the contigs are represented as 

* runs of N, but the exact sizes of the gaps are unknown. 

* This record will be updated with the finished sequence 

* as soon as it is available and the accession number will 



* be preserved. 





* 1 1147: contig of 1147 bp in length
* 1148 1247: gap of 100 bp
* 1248 2658: contig of 1411 bp in length
* 2659 2758: gap of 100 bp
* 2759 3896: contig of 1138 bp in length
* 3897 3996: gap of 100 bp
* 3997 5111: contig of 1115 bp in length
* 5112 5211: gap of 100 bp
* 5212 8025: contig of 2814 bp in length
* 8026 8125: gap of 100 bp
* 8126 10951: contig of 2826 bp in length
* 10952 11051: gap of 100 bp
* 11052 15725: contig of 4674 bp in length
* 15726 15825: gap of 100 bp
* 15826 21857: contig of 6032 bp in length
* 21858 21957: gap of 100 bp
* 21958 28985: contig of 7028 bp in length
* 28986 29085: gap of 100 bp
* 29086 42641: contig of 13556 bp in length
* 42642 42741: gap of 100 bp
* 42742 66388: contig of 23647 bp in length
* 66389 66488: gap of 100 bp
* 66489 84199: contig of 17711 bp in length.
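
The contig and gap coordinates listed for this record follow from the contig lengths alone once each gap is taken as a fixed run of 100 N. The following Python sketch rebuilds the coordinate list that way; the function name `layout` and the row wording are illustrative choices, not part of the GenBank record.

```python
def layout(contig_lengths, gap=100):
    """Return (start, end, description) rows for contigs joined by fixed gaps.

    Coordinates are 1-based and inclusive, as in the record's COMMENT block.
    """
    rows = []
    start = 1
    last = len(contig_lengths) - 1
    for i, length in enumerate(contig_lengths):
        end = start + length - 1
        rows.append((start, end, f"contig of {length} bp in length"))
        start = end + 1
        if i < last:  # no gap after the final contig
            rows.append((start, start + gap - 1, f"gap of {gap} bp"))
            start += gap
    return rows


# Contig lengths as reported for AC007426 (12 unordered pieces).
lengths = [1147, 1411, 1138, 1115, 2814, 2826,
           4674, 6032, 7028, 13556, 23647, 17711]

for start, end, what in layout(lengths):
    print(f"* {start} {end}: {what}")
```

The last row ends at 84199, the record length, and the contig lengths alone sum to 83099, the figure given as the sum-of-contigs insert size.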
FEATURES Location/Qualifiers
source 1..84199
    /organism="Homo sapiens"
    /db_xref="taxon:9606"
    /clone="RP5-241A3"
    /clone_lib="RPCI Human Male PAC"
misc_feature 1..1147
    /note="assembly_fragment"
misc_feature 1248..2658
    /note="assembly_fragment"
misc_feature 2759..3896
    /note="assembly_fragment"
misc_feature 3997..5111
    /note="assembly_fragment"
misc_feature 5212..8025
    /note="assembly_fragment"
misc_feature 8126..10951
    /note="assembly_fragment"
misc_feature 11052..15725
    /note="assembly_fragment"
misc_feature 15826..21857
    /note="assembly_fragment"
misc_feature 21958..28985
    /note="assembly_fragment"
misc_feature 29086..42641
    /note="assembly_fragment"
misc_feature 42742..66388
    /note="assembly_fragment"
misc_feature 66489..84199
    /note="assembly_fragment
    clone_end:SP6
    vector_side:right"
BASE COUNT 24761 a 14984 c 15272 g 28081 t 1101 others
ORIGIN 



Query Match 9.9%; Score 41.8; DB 2; Length 84199; 

Best Local Similarity 57.1%; Pred. No. 0.75; 

Matches 76; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 



Qy      2 aaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaa 61
          | || || ||   | | |||||  ||| |  ||| |    | || | ||| ||| |  ||
Db  19791 AGAAGAAGAAGAAGAAGAAGAACAAGAAGAGGAAGAAGAAAGAGAGAAGGAGAAGGAGAA 19732

Qy     62 gctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcga 121
          |  || || | |||   ||| ||| |  ||| | | || | |||    ||  ||||   |
Db  19731 GGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGGAGAAGAAGAAGAAGAA 19672

Qy    122 atccatgaggaag 134
              | || ||||
Db  19671 GAAGAAGAAGAAG 19659
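
The marker line printed between each Qy and Db row flags the columns where the two sequences carry the same base. For ungapped alignments such as the ones in this report (Indels 0, Gaps 0), that line can be regenerated from the two rows themselves. The Python sketch below does so for the last row of the alignment above; `match_line` is a hypothetical helper name, and case-insensitive comparison is assumed because the report prints some database rows in upper case and the query in lower case.

```python
def match_line(query: str, subject: str, mark: str = "|") -> str:
    """Build the marker line for an ungapped alignment row.

    A mark is placed under every column where query and subject agree,
    ignoring case; all other columns get a space.
    """
    if len(query) != len(subject):
        raise ValueError("ungapped rows must have equal length")
    return "".join(
        mark if q.upper() == s.upper() else " "
        for q, s in zip(query, subject)
    )


# Last row of the alignment above: Qy 122-134 against Db 19671-19659.
qy = "atccatgaggaag"
db = "GAAGAAGAAGAAG"
line = match_line(qy, db)
print(qy)
print(line)
print(db)
print(line.count("|"))  # 7 matching columns in this row
```

Summing the marks over all rows of a hit should reproduce the Matches figure in its summary line, which gives a quick check on a damaged marker line.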

Search completed: February 7, 2002, 10:51:11 
Job time: 8997 sec 



GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on: February 7, 2002, 09:36:54

Search time 428.31 Seconds
(without alignments)
842.693 Million cell updates/sec

Title: US-09-394-745-5893
Perfect score: 421
Sequence: 1 gaaaaaaataactcggaaaa..........ccatgttggttcctgcatgc 421

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 930621 seqs, 428662619 residues

Total number of hits satisfying chosen parameters: 1861242

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries
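
The throughput figure in this header can be cross-checked from the other values printed with it. A minimal Python sketch follows, assuming that one "cell update" is one query-residue by database-residue comparison and that both strands of every database sequence are searched. The second point is an inference from the hit count (1861242 is twice 930621 sequences), not something the report states; the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float, strands: int = 2) -> float:
    """Estimate dynamic-programming throughput in millions of cells per second.

    strands=2 assumes forward and reverse-complement searches (an assumption).
    """
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Values from the header above: 421-residue query, 428662619 database
# residues, 428.31 seconds search time.
rate = million_cell_updates_per_sec(421, 428662619, 428.31)
print(f"{rate:.3f} million cell updates/sec")
```

Under these assumptions the estimate lands within rounding of the reported 842.693 million cell updates per second.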



Database 



N Geneseq 1101: 



9 

10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 



/SIDS2/gcgdata/geneseq/geneseqn/NA1980 . 

/SIDS2/gcgdata/geneseq/geneseqn/NA1981 . 

/SIDS2/gcgdata/geneseq/geneseqn/NA1982 . 

/SIDS2/gcgdata/geneseq/geneseqn/NA1983 . 

/SIDS2/gcgdata/geneseq/geneseqn/NA1984 . 

/SIDS2/gcgdata/geneseq/geneseqn/NA1985 . 
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Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
PR 26-OCT-1999; 99US-0161360.

PR 26-OCT-1999; 99US-0161361.

PR 28-OCT-1999; 99US-0161920.

PR 28-OCT-1999; 99US-0161992.

PR 28-OCT-1999; 99US-0161993.

PR 29-OCT-1999; 99US-0162142.



Query Match 28.5%; Score 119.8; DB 21; Length 431; 

Best Local Similarity 67.3%; Pred. No. 1.6e-29;

Matches 169; Conservative 0; Mismatches 82; Indels 0; Gaps 0;

Qy 95 aatggcggaggccccgagcaagatcgaatccatgaggaagtgggtcgtcgagcacaagct 154 

I I I M I I I I I II II III I I I I I I I I I I I I II I I I I I I I I 
Db 31 aatggcggaaccaaagacaaaagttgcagaaatcagggaatggatcatcgaacataagct 90 

Qy 155 ccgagccgtaggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtc 214 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 91 tcgtaccgttggttgcttatggctaagtggtatctctggttcaattgcttataattggtc 150 

Qy 215 gcggcccaatatgaagcctagcgtcaagatcatccacgcaaggttgcatgctcaagctct 274

II I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I 
Db 151 taaacctgccatgaaaaccagtgtcagaatcatccacgctaggttgcatgctcaggcgct 210 

Qy 275 aaccctggctgcattagttggttctgcatgcgtggagtactatgaccagaagtatggttc 334 

II I I I I I I I I I I III II I I I I I I I I I I I I II I I I I II I 
Db 211 gacattagccgctctggctggagcagctgcagtggagtactatgatcacaaatctggagc 270 

Qy 335 ttctgggccaa 345 

III III 
Db 271 cactgatcgaa 281 



RESULT 2 
AAZ86967/C 

ID AAZ86967 standard; DNA; 162450 BP. 
XX 

AC AAZ86967; 
XX 

DT 16-MAY-2000 (first entry) 



XX 

DE Retinoblastoma binding protein-7 genomic DNA sequence. 
XX 

KW RBP-7; retinoblastoma binding protein-7; abnormal cell proliferation; 

KW diagnosis; therapy; cell differentiation; thyroid hyperplasia; psoriasis; 

KW benign prostate hypertrophy; cancer; sarcoma; neoplasm; leukaemia; 

KW lymphoma; ds.

XX 

OS Homo sapiens. 
XX 

PN WO200000607-A1 . 
XX 

PD 06-JAN-2000. 
XX 

PF 30-JUN-1999; 99WO-IB01242 . 
XX 

PR 30-JUN-1998; 98US-0091315 . 

PR 10-DEC-1998; 98US-0111909 . 
XX 

PA (GEST ) GENSET. 
XX 

PI Bougueleret L; 
XX 

DR WPI; 2000-117170/10. 
XX 

PT Novel nucleic acid and polymorphic markers used for diagnosis of 

PT diseases, especially those involving abnormal cell proliferation and 

PT differentiation - 

XX 

PS Claim 1; Page 118-163; 223pp; English. 
XX 

CC This sequence represents the retinoblastoma binding protein-7 (RBP-7) 

CC genomic sequence of the invention. The RBP-7 coding sequence and 

CC regulatory sequences are useful for the recombinant production of the 

CC protein and for expressing heterologous nucleic acids. Primers and 

CC probes derived from the RBP-7 nucleotide sequence (e.g. AAZ87035-Z87099) 

CC are useful for DNA amplification and detection methods. RBP-7 biallelic 

CC markers (see AAZ86993-Z87034) are useful for diagnosis of disease

CC related to alteration in the regulation or in the coding regions of the 

CC RBP-7 gene and for prognosis/diagnosis of an eventual treatment with 

CC therapeutic agents, especially agents acting on pathologies involving 

CC abnormal cell proliferation and/or differentiation, these include 

CC thyroid hyperplasia, psoriasis, benign prostate hypertrophy, cancers, 

CC including breast cancer, sarcomas and other neoplasms, bladder cancer, 

CC colon cancer, lung cancer, prostate cancer, various leukaemias, and 

CC lymphomas. RBP-7 antibodies are useful as diagnostic agents. 

XX 

SQ Sequence 162450 BP; 45465 A; 30661 C; 32637 G; 53673 T; 14 other; 



Query Match 9.4%; Score 39.4; DB 21; Length 162450; 

Best Local Similarity 55.5%; Pred. No. 0.089; 

Matches 76; Conservative 0; Mismatches 61; Indels 0; Gaps 0; 

Qy 1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60 

I I I I I I I I I I I I I I Mill I II I I I I I I I I II I I I I 
Db 26250 GCAAAAAAAAACAAAGAAACAAGGAAAAGGAAGGGAAGGGGAAGGGAAAGGGGAAGGGGA 26191



Qy 61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120 

I I I I I I I II I I I I I II III I I I I I I I I I I 

Db 26190 AAGAGATGGGGAGGGGGAAGTGTAAAGGGAGGGAGGGGAGGAAGGAAGGAAGGAAGGAAA 26131

Qy 121 aatccatgaggaagtgg 137 

II I II I I I I II 

Db 26130 AAGGGAAAGGGAAGTGG 26114



RESULT 3 
AAI26833 

ID AAI26833 standard; DNA; 244 BP. 
XX 

AC AAI26833; 
XX 

DT 12-OCT-2001 (first entry) 
XX 

DE Probe #16766 for gene expression analysis in human cervical cell sample. 
XX 

KW Probe; human; microarray; gene expression; cervical epithelial cell; 

KW cervical cancer; ss. 

XX 

OS Homo sapiens . 
XX 

PN WO200157278-A2 . 
XX 

PD 09-AUG-2001. 

XX 

PF 30-JAN-2001; 2001WO-US00670 . 
XX 

PR 04-FEB-2000; 2000US-0180312.

PR 26-MAY-2000; 2000US-0207456.

PR 30-JUN-2000; 2000US-0608408.

PR 03-AUG-2000; 2000US-0632366.

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359.

PR 04-OCT-2000; 2000GB-0024263.
XX 

PA (MOLE-) MOLECULAR DYNAMICS INC. 
XX 

PI Penn SG, Hanzel DK, Chen W, Rank DR; 
XX 

DR WPI; 2001-488901/53. 
XX 

PT Human genome-derived single exon nucleic acid probes useful for 

PT analyzing gene expression in human cervical epithelial cells - 
XX 

PS Claim 25; SEQ ID No 16766; 487pp; English. 
XX 

CC The present invention relates to human single exon nucleic acid probes 

CC (SENP) . The present sequence is one such probe. The SENPs are derived 

CC from human HeLa cells. The SENPs can be used to produce a single exon 

CC microarray, which can be used for measuring human gene expression in a 

CC sample derived from human cervical epithelial cells. By measuring gene 

CC expression, the probes are therefore useful in grading and/or staging 

CC of diseases of the cervix, notably cervical cancer. 



CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 244 BP; 101 A; 25 C; 81 G; 37 T; 0 other; 



Query Match 9.0%; Score 38; DB 22; Length 244;

Best Local Similarity 60.8%; Pred. No. 0.014; 

Matches 62; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy 4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63 

I I I I I I! I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I 
Db 129 aaaaagaaagagagaaagaaggagaaggagaagaagaggaacaggaaggggaagaaaaga 188

Qy 64 tgatggcggaggcccaggggaaagcaaagcaaatggcggagg 105 

III I I I I I I I I I I III I I I I I I 

Db 189 agaagaagaagacgaagaggaagaaggaggaggaggaggaag 230 



RESULT 4 
AAI55620 

ID AAI55620 standard; DNA; 244 BP. 
XX 

AC AAI55620; 
XX 

DT 17-OCT-2001 (first entry) 
XX 

DE Probe #24306 used to measure gene expression in human placenta sample. 
XX 

KW Probe; microarray; human; placenta; antenatal diagnosis; 

KW genetic disorder; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200157272-A2 . 
XX 

PD 09-AUG-2001. 
XX 

PF 30-JAN-2001; 2001WO-US00663 . 
XX 

PR 04-FEB-2000; 2000US-0180312.

PR 26-MAY-2000; 2000US-0207456.

PR 30-JUN-2000; 2000US-0608408.

PR 03-AUG-2000; 2000US-0632366.

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359.

PR 04-OCT-2000; 2000GB-0024263.
XX 

PA (MOLE-) MOLECULAR DYNAMICS INC. 
XX 

PI Penn SG, Hanzel DK, Chen W, Rank DR; 
XX 

DR WPI; 2001-488897/53.
XX 

PT Human genome-derived single exon nucleic acid probes useful for 

PT analyzing gene expression in human placenta - 



XX 

PS Claim 25; SEQ ID No 24306; 654pp; English. 
XX 

CC The present invention relates to single exon nucleic acid probes (SENP).

CC The present sequence is one such probe. The probes are useful for

CC producing a microarray for predicting, measuring and displaying gene

CC expression in samples derived from human placenta. The probes are useful

CC for antenatal diagnosis of human genetic disorders. 
XX 

SQ Sequence 244 BP; 101 A; 25 C; 81 G; 37 T; 0 other; 



Query Match 9.0%; Score 38; DB 22; Length 244; 

Best Local Similarity 60.8%; Pred. No. 0.014; 

Matches 62; Conservative 0; Mismatches 40; Indels 0; Gaps 0;

Qy 4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63 

I I II I I I I I I I I I I I I I I I I I I I I III I I I I II I I I I I I I 
Db 129 aaaaagaaagagagaaagaaggagaaggagaagaagaggaacaggaaggggaagaaaaga 188

Qy 64 tgatggcggaggcccaggggaaagcaaagcaaatggcggagg 105 

III I I I I II I I I I III I I I I I I 

Db 189 agaagaagaagacgaagaggaagaaggaggaggaggaggaag 230 



RESULT 5 
AAI17628 

ID AAI17628 standard; DNA; 597 BP. 
XX 

AC AAI17628; 
XX 

DT 12-OCT-2001 (first entry) 
XX 

DE Probe #7561 for gene expression analysis in human cervical cell sample. 
XX 

KW Probe; human; microarray; gene expression; cervical epithelial cell; 

KW cervical cancer; ss . 

XX 

OS Homo sapiens. 
XX 

PN WO200157278-A2 . 
XX 

PD 09-AUG-2001 . 
XX 

PF 30-JAN-2001; 2001WO-US00670 . 
XX 

PR 04-FEB-2000; 2000US-0180312.

PR 26-MAY-2000; 2000US-0207456.

PR 30-JUN-2000; 2000US-0608408.

PR 03-AUG-2000; 2000US-0632366.

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359.

PR 04-OCT-2000; 2000GB-0024263.
XX 

PA (MOLE-) MOLECULAR DYNAMICS INC. 
XX 

PI Penn SG, Hanzel DK, Chen W, Rank DR; 



XX 

DR WPI; 2001-488901/53. 
XX 

PT Human genome-derived single exon nucleic acid probes useful for

PT analyzing gene expression in human cervical epithelial cells - 
XX 

PS Claim 25; SEQ ID No 7561; 487pp; English. 
XX 

CC The present invention relates to human single exon nucleic acid probes 

CC (SENP) . The present sequence is one such probe. The SENPs are derived 

CC from human HeLa cells. The SENPs can be used to produce a single exon 

CC microarray, which can be used for measuring human gene expression in a 

CC sample derived from human cervical epithelial cells. By measuring gene 

CC expression, the probes are therefore useful in grading and/or staging 

CC of diseases of the cervix, notably cervical cancer. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 597 BP; 228 A; 67 C; 170 G; 132 T; 0 other; 



Query Match 9.0%; Score 38; DB 22; Length 597; 

Best Local Similarity 60.8%; Pred. No. 0.021; 

Matches 62; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy 4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63 

I I I I I I I I I I I I I I I I I I I I I I I I III Ml III Ml MM 
Db 389 aaaaagaaagagagaaagaaggagaaggagaagaagaggaacaggaaggggaagaaaaga 448

Qy 64 tgatggcggaggcccaggggaaagcaaagcaaatggcggagg 105 

III I I I I I I II II III I I I II I 

Db 449 agaagaagaagacgaagaggaagaaggaggaggaggaggaag 490 



RESULT 6 
AAI42551 

ID AAI42551 standard; DNA; 597 BP. 
XX 

AC AAI42551; 
XX 

DT 17-OCT-2001 (first entry) 
XX 

DE Probe #11237 used to measure gene expression in human placenta sample. 
XX 

KW Probe; microarray; human; placenta; antenatal diagnosis; 

KW genetic disorder; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200157272-A2. 
XX 

PD 09-AUG-2001. 
XX 

PF 30-JAN-2001; 2001WO-US00663 . 
XX 

PR 04-FEB-2000; 2000US-0180312 . 



PR 26-MAY-2000; 2000US-0207456 . 

PR 30-JUN-2000; 2000US-0608408 . 

PR 03-AUG-2000; 2000US-0632366 . 

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359 . 

PR 04-OCT-2000; 2000GB-0024263 . 
XX 

PA (MOLE- ) MOLECULAR DYNAMICS INC. 
XX 

PI Penn SG, Hanzel DK, Chen W, Rank DR; 
XX 

DR WPI; 2001-488897/53. 
XX 

PT Human genome-derived single exon nucleic acid probes useful for 

PT analyzing gene expression in human placenta - 

XX 

PS Claim 25; SEQ ID No 11237; 654pp; English. 
XX 

CC The present invention relates to single exon nucleic acid probes (SENP) . 

CC The present sequence is one such probe. The probes are useful for 

CC producing a microarray for predicting, measuring and displaying gene 

CC expression in samples derived from human placenta. The probes are useful 

CC for antenatal diagnosis of human genetic disorders. 

XX 

SQ Sequence 597 BP; 228 A; 67 C; 170 G; 132 T; 0 other; 



Query Match 9.0%; Score 38; DB 22; Length 597; 

Best Local Similarity 60.8%; Pred. No. 0.021; 

Matches 62; Conservative 0; Mismatches 40; Indels 0; Gaps 0; 

Qy 4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63 

I 11 I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I 
Db 389 aaaaagaaagagagaaagaaggagaaggagaagaagaggaacaggaaggggaagaaaaga 448 

Qy 64 tgatggcggaggcccaggggaaagcaaagcaaatggcggagg 105 

111 I I I I I I I I I I III I I I I I I 

Db 449 agaagaagaagacgaagaggaagaaggaggaggaggaggaag 490



RESULT 7 
AAI54321 

ID AAI54321 standard; DNA; 196 BP. 
XX 

AC AAI54321; 
XX 

DT 17-OCT-2001 (first entry) 
XX 

DE Probe #23007 used to measure gene expression in human placenta sample. 
XX 

KW Probe; microarray; human; placenta; antenatal diagnosis; 

KW genetic disorder; ss. 

XX 

OS Homo sapiens . 
XX 

PN WO200157272-A2. 
XX 



PD 09-AUG-2001. 
XX 

PF 30-JAN-2001; 2001WO-US00663 . 
XX 

PR 04-FEB-2000; 2000US-0180312 . 

PR 26-MAY-2000; 2000US-0207456 . 

PR 30-JUN-2000; 2000US-0608408 . 

PR 03-AUG-2000; 2000US-0632366.

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359 . 

PR 04-OCT-2000; 2000GB-0024263 . 
XX 

PA (MOLE-) MOLECULAR DYNAMICS INC. 
XX 

PI Penn SG, Hanzel DK, Chen W, Rank DR; 
XX 

DR WPI; 2001-488897/53. 
XX 

PT Human genome-derived single exon nucleic acid probes useful for 

PT analyzing gene expression in human placenta - 

XX 

PS Claim 25; SEQ ID No 23007; 654pp; English. 
XX 

CC The present invention relates to single exon nucleic acid probes (SENP) . 

CC The present sequence is one such probe. The probes are useful for 

CC producing a microarray for predicting, measuring and displaying gene 

CC expression in samples derived from human placenta. The probes are useful 

CC for antenatal diagnosis of human genetic disorders. 

XX 

SQ Sequence 196 BP; 102 A; 8 C; 85 G; 1 T; 0 other; 



Query Match 8.9%; Score 37.4; DB 22; Length 196; 

Best Local Similarity 53.0%; Pred. No. 0.019; 



Matches 80; Conservative 0; Mismatches 71; Indels 0; Gaps 0;

Qy      2 aaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaa 61
          ||||  |  |   | |  ||||||||| |  ||  |   | | | ||||| ||||    |
Db     32 aaaagcagcagcagcaggagaaggagaagaggaggaggaggaggaggaggagaaaaagga 91

Qy     62 gctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcga 121
          |  |  || ||||    ||| | | |   || ||| || |||||    |   ||||   |
Db     92 ggaggaggaggagaaggaggaggaggaggagaaaaaggaggaggagaaggagaagaagaa 151

Qy    122 atccatgaggaagtgggtcgtcgagcacaag 152
              | || ||||  |       || | |||
Db    152 ggagaagaagaagaagaagaggaagaagaag 182





RESULT 8 
AAI41282 

ID AAI41282 standard; DNA; 503 BP. 
XX 

AC AAI41282; 
XX 

DT 17-OCT-2001 (first entry) 
XX 



DE Probe #9968 used to measure gene expression in human placenta sample. 
XX 

KW Probe; microarray; human; placenta; antenatal diagnosis; 

KW genetic disorder; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200157272-A2. 
XX 

PD 09-AUG-2001. 
XX 

PF 30-JAN-2001; 2001WO-US00663.
XX

PR 04-FEB-2000; 2000US-0180312.

PR 26-MAY-2000; 2000US-0207456.

PR 30-JUN-2000; 2000US-0608408.

PR 03-AUG-2000; 2000US-0632366.

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359.

PR 04-OCT-2000; 2000GB-0024263.
XX 

PA (MOLE-) MOLECULAR DYNAMICS INC. 
XX 

PI Penn SG, Hanzel DK, Chen W, Rank DR; 
XX 

DR WPI; 2001-488897/53. 
XX 

PT Human genome-derived single exon nucleic acid probes useful for 

PT analyzing gene expression in human placenta - 

XX 

PS Claim 25; SEQ ID No 9968; 654pp; English. 
XX 

CC The present invention relates to single exon nucleic acid probes (SENP) . 

CC The present sequence is one such probe. The probes are useful for 

CC producing a microarray for predicting, measuring and displaying gene 

CC expression in samples derived from human placenta. The probes are useful

CC for antenatal diagnosis of human genetic disorders. 

XX 

SQ Sequence 503 BP; 207 A; 59 C; 143 G; 94 T; 0 other; 



Query Match 8.9%; Score 37.4; DB 22; Length 503; 

Best Local Similarity 53.0%; Pred. No. 0.03; 



Matches 80; Conservative 0; Mismatches 71; Indels 0; Gaps 0;

Qy      2 aaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaa 61
          ||||  |  |   | |  ||||||||| |  ||  |   | | | ||||| ||||    |
Db    306 aaaagcagcagcagcaggagaaggagaagaggaggaggaggaggaggaggagaaaaagga 365

Qy     62 gctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcga 121
          |  |  || ||||    ||| | | |   || ||| || |||||    |   ||||   |
Db    366 ggaggaggaggagaaggaggaggaggaggagaaaaaggaggaggagaaggagaagaagaa 425

Qy    122 atccatgaggaagtgggtcgtcgagcacaag 152
              | || ||||  |       || | |||
Db    426 ggagaagaagaagaagaagaggaagaagaag 456





RESULT 9 
AAZ23891 

ID AAZ23891 standard; DNA; 49999 BP. 
XX 

AC AAZ23891; 
XX 

DT 25-JAN-2000 (first entry) 
XX 

DE Murine LOBO genomic DNA fragment 1. 
XX 

KW LOBO; long bones; bone development; bone extension; skull; osteopathic; 

KW diagnostic; pharmaceutical; gene therapy; transgenic animal; disease; 

KW spondyloepiphysal dysplasia; achondroplasia; murine; ds . 
XX 

OS Mus musculus. 
XX 

PN WO9950284-A2 . 
XX 

PD 07-OCT-1999. 
XX 

PF 26-MAR-1999; 99WO-EP02055 . 
XX 

PR 27-MAR-1998; 98DE-1013799.
XX 

PA (ROSE/) ROSENTHAL A. 
XX 

PI Rosenthal A, Rump A, Hess J, Aigner T, Wirth T; 
XX 

DR WPI; 1999-601320/51. 
XX 

PT Nucleic acids encoding proteins which influence bone development, 

PT useful for treating and studying bone disorders - 

XX 

PS Example 3; Page 69-97; 391pp; German. 
XX 

CC This invention describes novel nucleic acids (I; designated LOBO (long 

CC bones)) encoding proteins influencing bone development in mammals. The 

CC proteins of the invention reduce and/or inactivate bone extension (i.e. 

CC development), with exception of the skull and have osteopathic activity. 

CC The nucleic acid molecules, proteins and antibodies can be used in 

CC diagnostic or pharmaceutical compounds e.g. for gene therapy. The methods 

CC and nucleic acid molecules, etc. are useful for production of transgenic 

CC animals, especially a transgenic mouse for the study of diseases 

CC associated with bone development, e.g. spondyloepiphysal dysplasia and 

CC achondroplasia. This sequence encodes the murine LOBO protein described 

CC in the method of the invention. 

XX 

SQ Sequence 49999 BP; 13210 A; 11814 C; 10825 G; 14150 T; 0 other; 



Query Match 8.5%; Score 35.6; DB 20; Length 49999; 

Best Local Similarity 53.6%; Pred. No. 0.89; 

Matches 74; Conservative 0; Mismatches 64; Indels 0; Gaps 0; 



Qy 15 ggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggag 74 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III III I I I I 



Db 7611 gaagaagaaggagacggagagaagaagaaggagaaggaaaaagagaagaagaagaaggag 7670



Qy 75 gcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaag 134 

II I I I I I I I I I I I I II I I I I I II I I II 

Db 7671 aaggagaaagagaaggagaagaaggaggaggaggagaaggagaagaagaagaagaagaag 7730

Qy 135 tgggtcgtcgagcacaag 152 

I I I I I I I 

Db 7731 aagaagaagaagaagaag 7748 



RESULT 10 
AAZ23896 

ID AAZ23896 standard; DNA; 49999 BP.
XX 

AC AAZ23896; 
XX 

DT 25-JAN-2000 (first entry) 
XX 

DE Murine LOBO homologue genomic DNA fragment 2. 
XX 

KW LOBO; long bones; bone development; bone extension; skull; osteopathic; 

KW diagnostic; pharmaceutical; gene therapy; transgenic animal; disease; 

KW spondyloepiphysal dysplasia; achondroplasia; murine; ds . 
XX 

OS Mus musculus. 
XX 

PN WO9950284-A2. 
XX 

PD 07-OCT-1999. 
XX 

PF 26-MAR-1999; 99WO-EP02055 . 
XX 

PR 27-MAR-1998; 98DE-1013799.
XX 

PA (ROSE/) ROSENTHAL A. 
XX 

PI Rosenthal A, Rump A, Hess J, Aigner T, Wirth T; 
XX 

DR WPI; 1999-601320/51. 
XX 

PT Nucleic acids encoding proteins which influence bone development, 

PT useful for treating and studying bone disorders - 

XX 

PS Example 3; Page 161-189; 391pp; German. 
XX 

CC This invention describes novel nucleic acids (I; designated LOBO (long 

CC bones)) encoding proteins influencing bone development in mammals. The 

CC proteins of the invention reduce and/or inactivate bone extension (i.e. 

CC development), with exception of the skull and have osteopathic activity. 

CC The nucleic acid molecules, proteins and antibodies can be used in 

CC diagnostic or pharmaceutical compounds e.g. for gene therapy. The methods 

CC and nucleic acid molecules, etc. are useful for production of transgenic 

CC animals, especially a transgenic mouse for the study of diseases 

CC associated with bone development, e.g. spondyloepiphysal dysplasia and 

CC achondroplasia. This sequence encodes the murine LOBO protein described 

CC in the method of the invention. 



XX 

SQ Sequence 49999 BP; 13135 A; 11787 C; 10868 G; 14209 T; 0 other; 



Query Match 8.5%; Score 35.6; DB 20; Length 49999; 

Best Local Similarity 53.6%; Pred. No. 0.89; 

Matches 74; Conservative 0; Mismatches 64; Indels 0; Gaps 0; 

Qy 15 ggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggag 74 

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I III III I I I I 
Db 9596 gaagaagaaggagacggagagaagaagaaggagaaggaaaaagagaagaagaagaaggag 9655 

Qy 75 gcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaag 134 

II I I I I I I I I I I II II III I I I I I I I I 

Db 9656 aaggagaaagagaaggagaagaaggaggaggaggagaaggagaagaagaagaagaagaag 9715 

Qy 135 tgggtcgtcgagcacaag 152 

I I I I I I I 

Db 9716 aagaagaagaagaagaag 9733 



RESULT 11
AAI11457/C

ID AAI11457 standard; DNA; 475 BP.
XX
AC AAI11457;
XX
DT 12-OCT-2001 (first entry)
XX
DE Probe #1390 for gene expression analysis in human cervical cell sample.
XX
KW Probe; human; microarray; gene expression; cervical epithelial cell;
KW cervical cancer; ss.
XX
OS Homo sapiens.
XX
PN WO200157278-A2.
XX
PD 09-AUG-2001.
XX
PF 30-JAN-2001; 2001WO-US00670.
XX
PR 04-FEB-2000; 2000US-0180312.
PR 26-MAY-2000; 2000US-0207456.
PR 30-JUN-2000; 2000US-0608408.
PR 03-AUG-2000; 2000US-0632366.
PR 21-SEP-2000; 2000US-0234687.
PR 27-SEP-2000; 2000US-0236359.
PR 04-OCT-2000; 2000GB-0024263.
XX
PA (MOLE-) MOLECULAR DYNAMICS INC.
XX
PI Penn SG, Hanzel DK, Chen W, Rank DR;
XX
DR WPI; 2001-488901/53.
XX
PT Human genome-derived single exon nucleic acid probes useful for
PT analyzing gene expression in human cervical epithelial cells -
XX 

PS Claim 25; SEQ ID No 1390; 487pp; English. 
XX 

CC The present invention relates to human single exon nucleic acid probes 

CC (SENP) . The present sequence is one such probe. The SENPs are derived 

CC from human HeLa cells. The SENPs can be used to produce a single exon 

CC microarray, which can be used for measuring human gene expression in a 

CC sample derived from human cervical epithelial cells. By measuring gene 

CC expression, the probes are therefore useful in grading and/or staging 

CC of diseases of the cervix, notably cervical cancer. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 475 BP; 24 A; 201 C; 30 G; 220 T; 0 other; 



Query Match 8.4%; Score 35.2; DB 22; Length 475; 

Best Local Similarity 52.0%; Pred. No. 0.15; 



Matches 79; Conservative 0; Mismatches 73; Indels 0; Gaps 0;

Qy   1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60
       ||| ||    |   | |  ||| |  || |   |  ||  | | | | ||| | | |
Db 314 GAAGAAGGAGAGAAGAAGGAGAGGAGGAGGAGCAGGATGAGGAGGAGCAGGAGGAGGAGG 255

Qy  61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120
       ||| |  || |||||  |||| ||| |   ||||   |  |||||    |   ||||
Db 254 AGCAGGAGGAGGAGGAGCAGGAGAAGGAGGAGCAGGAGAAGGAGGAGGGGGAGAAGAAGA 195

Qy 121 aatccatgaggaagtgggtcgtcgagcacaag 152
       |    | || ||||  ||     ||| | |||
Db 194 AGGAGAAGAAGAAGGAGGAGAAGGAGGAGAAG 163


RESULT 12 
AAI32728/c 

ID AAI32728 standard; DNA; 475 BP. 
XX 

AC AAI32728; 
XX 

DT 17-OCT-2001 (first entry) 
XX 

DE Probe #1414 used to measure gene expression - in human placenta sample. 
XX 

KW Probe; microarray; human; placenta; antenatal diagnosis; 

KW genetic disorder; ss . 

XX 

OS Homo sapiens . 
XX 

PN WO200157272-A2. 
XX 

PD 09-AUG-2001. 
XX 

PF 30-JAN-2001; 2001WO-US00663 . 
XX 

PR 04-FEB-2000; 2000US-0180312.

PT Human genome-derived single exon nucleic acid probes useful for

PT analyzing gene expression in human placenta -
XX

PS Claim 25; SEQ ID No 1414; 654pp; English.
XX

CC The present invention relates to single exon nucleic acid probes (SENP).

CC The present sequence is one such probe. The probes are useful for

CC producing a microarray for predicting, measuring and displaying gene

CC expression in samples derived from human placenta. The probes are useful

CC for antenatal diagnosis of human genetic disorders.

XX

SQ Sequence 475 BP; 24 A; 201 C; 30 G; 220 T; 0 other;







Query Match 8.4%; Score 35.2; DB 22; Length 475; 

Best Local Similarity 52.0%; Pred. No. 0.15; 

Matches 79; Conservative 0; Mismatches 73; Indels 0; Gaps 0; 

Qy   1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60
       ||| ||    |   | |  ||| |  || |   |  ||  | | | | ||| | | |
Db 314 GAAGAAGGAGAGAAGAAGGAGAGGAGGAGGAGCAGGATGAGGAGGAGCAGGAGGAGGAGG 255

Qy  61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120
       ||| |  || |||||  |||| ||| |   ||||   |  |||||    |   ||||
Db 254 AGCAGGAGGAGGAGGAGCAGGAGAAGGAGGAGCAGGAGAAGGAGGAGGGGGAGAAGAAGA 195

Qy 121 aatccatgaggaagtgggtcgtcgagcacaag 152
       |    | || ||||  ||     ||| | |||
Db 194 AGGAGAAGAAGAAGGAGGAGAAGGAGGAGAAG 163



RESULT 13 
AAI01373/C 

ID AAI01373 standard; DNA; 475 BP. 
XX 

AC AAI01373; 
XX 

DT 09-OCT-2001 (first entry) 
XX 

DE Probe #1364 used to measure gene expression in human breast sample. 
XX 

KW Probe; human; breast disease; breast cancer; development disorder; ss; 

KW inflammatory disease; proliferative breast disease; non-carcinoma tumour. 
XX 



OS Homo sapiens. 
XX 

PN WO200157270-A2. 
XX 

PD 09-AUG-2001. 
XX 

PF 29-JAN-2001; 2001WO-US00661.
XX

PR 04-FEB-2000; 2000US-0180312.

PR 26-MAY-2000; 2000US-0207456.

PR 30-JUN-2000; 2000US-0608408.

PR 03-AUG-2000; 2000US-0632366.

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359.

PR 04-OCT-2000; 2000GB-0024263.
XX 

PA (MOLE-) MOLECULAR DYNAMICS INC. 
XX 

PI Penn SG, Hanzel DK, Chen W, Rank DR; 
XX 

DR WPI; 2001-476286/51. 
XX 

PT Novel single exon nucleic acid probe used to measuring gene expression 

PT in a human breast - 

XX 

PS Claim 25; SEQ ID No 1364; 322pp; English. 
XX 

CC The present invention relates to novel single exon nucleic acid probes. 

CC The present sequence is one such probe. The probes are useful for 

CC measuring human gene expression in a human breast sample, where the probe 

CC hybridises at high stringency to a nucleic acid expressed in the human 

CC breast. The probes are useful for predicting, diagnosing, grading, 

CC staging, monitoring and prognosing diseases of the human breast, 

CC particularly those diseases with polygenic aetiology. The diseases 

CC include: breast cancer, disorders of development, inflammatory diseases 

CC of the breast, fibrocystic changes, proliferative breast disease and 

CC non-carcinoma tumours. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 475 BP; 24 A; 201 C; 30 G; 220 T; 0 other; 



Query Match 8.4%; Score 35.2; DB 22; Length 475; 

Best Local Similarity 52.0%; Pred. No. 0.15; 

Matches 79; Conservative 0; Mismatches 73; Indels 0; Gaps 0; 

Qy   1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60
       ||| ||    |   | |  ||| |  || |   |  ||  | | | | ||| | | |
Db 314 GAAGAAGGAGAGAAGAAGGAGAGGAGGAGGAGCAGGATGAGGAGGAGCAGGAGGAGGAGG 255

Qy  61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120
       ||| |  || |||||  |||| ||| |   ||||   |  |||||    |   ||||
Db 254 AGCAGGAGGAGGAGGAGCAGGAGAAGGAGGAGCAGGAGAAGGAGGAGGGGGAGAAGAAGA 195

Qy 121 aatccatgaggaagtgggtcgtcgagcacaag 152
       |    | || ||||  ||     ||| | |||
Db 194 AGGAGAAGAAGAAGGAGGAGAAGGAGGAGAAG 163



RESULT 14 
AAI20671/C 

ID AAI20671 standard; DNA; 512 BP. 
XX 

AC AAI20671; 
XX 

DT 12-OCT-2001 (first entry) 
XX 

DE Probe #10604 for gene expression analysis in human cervical cell sample. 
XX 

KW Probe; human; microarray; gene expression; cervical epithelial cell; 

KW cervical cancer; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200157278-A2. 
XX 

PD 09-AUG-2001. 
XX 

PF 30-JAN-2001; 2001WO-US00670.
XX

PR 04-FEB-2000; 2000US-0180312.

PR 26-MAY-2000; 2000US-0207456.

PR 30-JUN-2000; 2000US-0608408.

PR 03-AUG-2000; 2000US-0632366.

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359.

PR 04-OCT-2000; 2000GB-0024263.
XX 

PA (MOLE- ) MOLECULAR DYNAMICS INC. 
XX 

PI Penn SG, Hanzel DK, Chen W, Rank DR; 
XX 

DR WPI; 2001-488901/53. 
XX 

PT Human genome-derived single exon nucleic acid probes useful for 

PT analyzing gene expression in human cervical epithelial cells - 
XX 

PS Claim 25; SEQ ID No 10604; 487pp; English. 
XX 

CC The present invention relates to human single exon nucleic acid probes 

CC (SENP) . The present sequence is one such probe. The SENPs are derived 

CC from human HeLa cells. The SENPs can be used to produce a single exon 

CC microarray, which can be used for measuring human gene expression in a 

CC sample derived from human cervical epithelial cells. By measuring gene 

CC expression, the probes are therefore useful in grading and/or staging 

CC of diseases of the cervix, notably cervical cancer. 

CC Note: The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format directly from WIPO 

CC at ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 512 BP; 18 A; 231 C; 28 G; 235 T; 0 other;



Query Match 8.4%; Score 35.2; DB 22; Length 512;

Best Local Similarity 52.0%; Pred. No. 0.15;

Matches 79; Conservative 0; Mismatches 73; Indels 0; Gaps 0;

Qy   1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60
       ||| ||    |   | |  ||| |  || |   |  ||  | | | | ||| | | |
Db 180 GAAGAAGGAGAGAAGAAGGAGAGGAGGAGGAGCAGGATGAGGAGGAGCAGGAGGAGGAGG 121

Qy  61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120
       ||| |  || |||||  |||| ||| |   ||||   |  |||||    |   ||||
Db 120 AGCAGGAGGAGGAGGAGCAGGAGAAGGAGGAGCAGGAGAAGGAGGAGGGGGAGAAGAAGA 61

Qy 121 aatccatgaggaagtgggtcgtcgagcacaag 152
       |    | || ||||  ||     ||| | |||
Db  60 AGGAGAAGAAGAAGGAGGAGAAGGAGGAGAAG 29



RESULT 15 
AAI45882/C 

ID AAI45882 standard; DNA; 512 BP. 
XX 

AC AAI45882; 
XX 

DT 17-OCT-2001 (first entry) 
XX 

DE Probe #14568 used to measure gene expression in human placenta sample.
XX 

KW Probe; microarray; human; placenta; antenatal diagnosis; 

KW genetic disorder; ss. 

XX 

OS Homo sapiens . 
XX 

PN WO200157272-A2. 
XX 

PD 09-AUG-2001. 
XX 

PF 30-JAN-2001; 2001WO-US00663.
XX

PR 04-FEB-2000; 2000US-0180312.

PR 26-MAY-2000; 2000US-0207456.

PR 30-JUN-2000; 2000US-0608408.

PR 03-AUG-2000; 2000US-0632366.

PR 21-SEP-2000; 2000US-0234687.

PR 27-SEP-2000; 2000US-0236359.

PR 04-OCT-2000; 2000GB-0024263.
XX

PA (MOLE-) MOLECULAR DYNAMICS INC.
XX

PI Penn SG, Hanzel DK, Chen W, Rank DR;
XX 

DR WPI; 2001-488897/53. 
XX 

PT Human genome-derived single exon nucleic acid probes useful for 

PT analyzing gene expression in human placenta - 

XX 

PS Claim 25; SEQ ID No 14568; 654pp; English. 



XX 

CC The present invention relates to single exon nucleic acid probes (SENP) . 

CC The present sequence is one such probe. The probes are useful for 

CC producing a microarray for predicting, measuring and displaying gene 

CC expression in samples derived from human placenta. The probes are useful 

CC for antenatal diagnosis of human genetic disorders. 

XX 

SQ Sequence 512 BP; 18 A; 231 C; 28 G; 235 T; 0 other; 



Query Match 8.4%; Score 35.2; DB 22; Length 512; 

Best Local Similarity 52.0%; Pred. No. 0.15; 

Matches 79; Conservative 0; Mismatches 73; Indels 0; Gaps 0; 

Qy   1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60
       ||| ||    |   | |  ||| |  || |   |  ||  | | | | ||| | | |
Db 180 GAAGAAGGAGAGAAGAAGGAGAGGAGGAGGAGCAGGATGAGGAGGAGCAGGAGGAGGAGG 121

Qy  61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120
       ||| |  || |||||  |||| ||| |   ||||   |  |||||    |   ||||
Db 120 AGCAGGAGGAGGAGGAGCAGGAGAAGGAGGAGCAGGAGAAGGAGGAGGGGGAGAAGAAGA 61

Qy 121 aatccatgaggaagtgggtcgtcgagcacaag 152
       |    | || ||||  ||     ||| | |||
Db  60 AGGAGAAGAAGAAGGAGGAGAAGGAGGAGAAG 29



Search completed: February 7, 2002, 10:59:33 
Job time: 4959 sec 
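
The Query Match and Best Local Similarity figures printed with each result above appear to follow directly from the other numbers on the same lines. The sketch below is not GenCore code; it assumes that Query Match is the score as a percentage of the perfect score (421 for this query), and that Best Local Similarity is the number of identical positions as a percentage of all aligned positions. The function names are hypothetical, and counting indels in the denominator is an assumption that these results (all with Indels 0) cannot confirm.

```python
def query_match_percent(score, perfect_score):
    """Score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, conservative, mismatches, indels=0):
    """Identical positions as a percentage of all aligned positions.

    Including indels in the denominator is an assumption.
    """
    aligned = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned, 1)


# Values taken from the RESULT 11 to 15 entries (perfect score 421).
print(query_match_percent(35.2, 421))    # 8.4
print(best_local_similarity(79, 0, 73))  # 52.0
```

Both values agree with the "Query Match 8.4%" and "Best Local Similarity 52.0%" lines reported for these results.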

GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on: February 7, 2002, 09:10:34 ; Search time 172.96 Seconds
(without alignments)
551.268 Million cell updates/sec

Title: US-09-394-745-5893

Perfect score: 421

Sequence: 1 gaaaaaaataactcggaaaa ccatgttggttcctgcatgc 421

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 351203 seqs, 113238999 residues

Total number of hits satisfying chosen parameters: 702406

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%

Maximum Match 100%

Listing first 45 summaries



Database : Issued_Patents_NA:*

1: /cgn2_6/ptodata/2/ina/5A_COMB.seq:*

2: /cgn2_6/ptodata/2/ina/5B_COMB.seq:*

3: /cgn2_6/ptodata/2/ina/6A_COMB.seq:*

4: /cgn2_6/ptodata/2/ina/6B_COMB.seq:*

5: /cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:*

6: /cgn2_6/ptodata/2/ina/backfiles1.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
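
The throughput and hit-count figures in the header above can be cross-checked with simple arithmetic. The sketch below assumes that one "cell update" is one cell of the query-by-database dynamic-programming matrix and that both strands of every database sequence were searched (hence the factor of 2). That factor is an inference from the reported numbers, not something the report states, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Matrix cells filled per second, in millions.

    strands=2 assumes each database sequence is searched in both
    orientations; this is an assumption, not documented behaviour.
    """
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


# Figures from the header: 421-residue query, 113238999 database
# residues, 172.96 seconds of search time, 351203 database sequences.
rate = million_cell_updates_per_sec(421, 113238999, 172.96)
candidate_hits = 351203 * 2

print(f"{rate:.1f} million cell updates/sec")
print(candidate_hits)
```

Under these assumptions the computed rate comes out close to the reported 551.268 million cell updates/sec, and the doubled sequence count equals the reported 702406 hits.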

SUMMARIES

                %
Result        Query
  No.  Score  Match  Length  DB  ID                  Description

c   1   49.6   11.8    7218   1  US-08-232-463-14    Sequence 14, Appl
    2   38.2    9.1     289   4  US-09-007-005-17    Sequence 17, Appl
    3   38.2    9.1     289   4  US-09-244-796-17    Sequence 17, Appl
    4     34    8.1    3756   2  US-08-576-626A-1    Sequence 1, Appli
c   5   33.2    7.9    1763   6  5198542-1           Patent No. 5198542
    6   32.6    7.7    1027   2  US-08-867-087B-54   Sequence 54, Appl
    7   31.8    7.6     248   4  US-09-007-005-32    Sequence 32, Appl
    8   31.8    7.6     248   4  US-09-244-796-32    Sequence 32, Appl
    9   31.8    7.6     277   4  US-09-007-005-3     Sequence 3, Appli
   10   31.8    7.6     277   4  US-09-244-796-3     Sequence 3, Appli
c  11   31.8    7.6    2426   5  PCT-US91-09422-20   Sequence 20, Appl
c  12   31.4    7.5   24979   2  US-08-147-777-3     Sequence 3, Appli
c  13   31.4    7.5   24979   3  US-08-452-872-3     Sequence 3, Appli
c  14   31.4    7.5   24979   5  PCT-US93-03985-3    Sequence 3, Appli
c  15   30.8    7.3    4897   6  5196516-7           Patent No. 5196516
   16   30.2    7.2    1066   1  US-08-314-309A-18   Sequence 18, Appl
   17   30.2    7.2    2076   5  PCT-US91-08442-1    Sequence 1, Appli
   18   30.2    7.2    2079   3  US-08-557-210A-6    Sequence 6, Appli
   19   30.2    7.2    2109   3  US-08-557-210A-7    Sequence 7, Appli
   20   30.2    7.2    2109   3  US-08-557-210A-8    Sequence 8, Appli
   21   30.2    7.2    2300   4  US-09-344-438-2     Sequence 2, Appli
   22   30.2    7.2    2580   2  US-08-887-798-1     Sequence 1, Appli
   23   30.2    7.2    3237   2  US-08-419-075-26    Sequence 26, Appl
   24   30.2    7.2    3337   1  US-08-072-610-1     Sequence 1, Appli
   25   30.2    7.2    3337   2  US-08-719-822B-1    Sequence 1, Appli
   26   30.2    7.2    3337   4  US-09-092-458-1     Sequence 1, Appli
c  27   30.2    7.2    6623   2  US-08-244-434-36    Sequence 36, Appl
c  28   30.2    7.2    6630   2  US-08-244-434-37    Sequence 37, Appl
   29   30.2    7.2    7305   1  US-08-286-740-4     Sequence 4, Appli
   30   30.2    7.2    7305   5  PCT-US95-09576-4    Sequence 4, Appli
c  31   30.2    7.2    8575   5  PCT-US92-08258-6    Sequence 6, Appli
c  32   30.2    7.2    8932   2  US-08-252-493C-8    Sequence 8, Appli
c  33   30.2    7.2    8932   3  US-09-276-197-8     Sequence 8, Appli
c  34   30.2    7.2   10580   1  US-08-196-259-1     Sequence 1, Appli
c  35   30.2    7.2   11616   1  US-08-196-259-2     Sequence 2, Appli
   36     30    7.1     118   1  US-08-182-175A-52   Sequence 52, Appl
   37     30    7.1     118   1  US-08-474-633A-61   Sequence 61, Appl
   38     30    7.1     118   5  PCT-US92-06412-52   Sequence 52, Appl
c  39   (remaining fields illegible in source)
c  40     30    7.1    1223   4  US-08-957-351-29    Sequence 29, Appl
c  41     30    7.1    1240   4  US-08-957-351-8     Sequence 8, Appli
   42     30    7.1    1276   4  US-09-177-325-2     Sequence 2, Appli
   43     30    7.1    1276   4  US-09-411-812A-2    Sequence 2, Appli
   44     30    7.1    1276   4  US-09-590-113-2     Sequence 2, Appli
   45   29.8    7.1   15144   3  US-08-458-434A-6    Sequence 6, Appli


ALIGNMENTS 



RESULT 1 
US-08-232-463-14/C 

; Sequence 14, Application US/08232463 
; Patent No. 5670367 
; GENERAL INFORMATION: 

APPLICANT: DORNER, F. 

APPLICANT: SCHEIFLINGER, F. 

APPLICANT: FALKNER, F. G. 

TITLE OF INVENTION : RECOMBINANT FOWLPOX VIRUS 
NUMBER OF SEQUENCES: 52 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Foley & Lardner 

STREET: 1800 Diagonal Road, Suite 500 
/ CITY: Alexandria 

STATE: VA 

COUNTRY: USA 

ZIP: 22313-0299 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/232,463

FILING DATE: 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/07/935,313 

FILING DATE: 

APPLICATION NUMBER: EP 91 114 300.6 

FILING DATE: 26-AUG-1991 
ATTORNEY/AGENT INFORMATION: 

NAME: BENT, Stephen A. 

REGISTRATION NUMBER: 29,768 

REFERENCE/DOCKET NUMBER: 30472/114 IMMU 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (703)836-9300

TELEFAX: (703)683-4109 

TELEX: 899149 
INFORMATION FOR SEQ ID NO: 14: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 7218 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 



IMMEDIATE SOURCE: 
CLONE: pTZgpt-Fls 
US-08-232-463-14 



Query Match 11.8%; Score 49.6; DB 1; Length 7218; 

Best Local Similarity 6.4%; Pred. No. 5.2e-06; 

Matches 10; Conservative 106; Mismatches 40; Indels 0; Gaps 0 
Qy 1 gaaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaa 60 



Db 1210 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1151 
Qy 61 agctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcg 120 



Db 1150 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 1091 

Qy 121 aatccatgaggaagtgggtcgtcgagcacaagctcc 156 

: : ::::::::::: : : I 1 I I I I I I M 
Db 1090 RRRRRRRRRRRRRRRRRRRRRRRRATCGCAAGCTCC 1055 



RESULT 2 
US-09-007-005-17 

; Sequence 17, Application US/09007005B 

; Patent No. 6258558 

; GENERAL INFORMATION: 

; APPLICANT: Szostak, Jack W. 

; APPLICANT: Roberts, Richard W. 

; APPLICANT: Liu, Rihe 

; TITLE OF INVENTION: SELECTION OF PROTEINS USING RNA-PROTEIN 

; TITLE OF INVENTION: FUSIONS 

; FILE REFERENCE: 00786/350003 

; CURRENT APPLICATION NUMBER: US/09/007,005B

; CURRENT FILING DATE: 1998-01-14 

; EARLIER APPLICATION NUMBER: 60/035,963 

; EARLIER FILING DATE: 1997-01-27 

; EARLIER APPLICATION NUMBER: 60/064,491 

; EARLIER FILING DATE: 1997-11-06 

; NUMBER OF SEQ ID NOS : 33 

SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 17 

LENGTH: 289

TYPE: RNA 

ORGANISM: Artificial Sequence 
FEATURE : 

OTHER INFORMATION: Translation template 
FEATURE : 

NAME/KEY: misc_feature 
LOCATION: (1) . . . (289) 
OTHER INFORMATION: n = A,T,C or G 
US-09-007-005-17 



Query Match 9.1%; Score 38.2; DB 4; Length 289; 

Best Local Similarity 5.9%; Pred. No. 0.005; 

Matches 14; Conservative 103; Mismatches 120; Indels 0; Gaps 0 



Qy 4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63 



Db 19 rarcrurarurururarcrararururarcrararurgrnrnrsrnrnrsrnrnrsrnrn 78 

Qy 64 tgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaat 123 



Db 79 rsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrn 138

Qy 124 ccatgaggaagtgggtcgtcgagcacaagctccgagccgtaggttgcctctggctaggtg 183 



Db 139 rsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrn 198 

Qy 184 ggatcagcagttcgatcgcctacaactggtcgcggcccaatatgaagcctagcgtca 240

: : : : : : : : : | : : : | : : I I : : I : : I I : 
Db 199 rsrnrnrsrnrnrsrnrnrsrcrargrcrurgrcrgrurararcrurcrururgrcr 255 



RESULT 3 
US-09-244-796-17 

; Sequence 17, Application US/09244796 

; Patent No. 6281344 

; GENERAL INFORMATION: 

; APPLICANT: Szostak, Jack W. 

; APPLICANT: Roberts, Richard W. 

; APPLICANT: Liu, Rihe 

; TITLE OF INVENTION: SELECTION OF PROTEINS USING RNA- PROTEIN 
; TITLE OF INVENTION : FUSIONS 

FILE REFERENCE: 00786/350007 
; CURRENT APPLICATION NUMBER: US/09/244,796
; CURRENT FILING DATE: 1999-02-05 
/ EARLIER APPLICATION NUMBER: 60/035,963 
; EARLIER FILING DATE: 1997-01-27 
; EARLIER APPLICATION NUMBER: 60/064,491 
; EARLIER FILING DATE: 1997-11-06 
; EARLIER APPLICATION NUMBER: 09/007,005 
; EARLIER FILING DATE: 1998-01-14 
; NUMBER OF SEQ ID NOS : 33 

SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 17 

LENGTH: 289 

TYPE: RNA 
; ORGANISM: Artificial Sequence 

FEATURE: 

OTHER INFORMATION: Translation template 
FEATURE : 

NAME/KEY: misc_feature
LOCATION: (1) . . . (289) 
OTHER INFORMATION: n = A,T,C or G 
US-09-244-796-17 



Query Match 9.1%; Score 38.2; DB 4; Length 289; 

Best Local Similarity 5.9%; Pred. No. 0.005; 

Matches 14; Conservative 103; Mismatches 120; Indels 0; Gaps 0; 



Qy 4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63 



: I : : : : I : : : | : : | : : : : : | : | : | ::::::::: : : : 
Db 19 rarcrurarurururarcrararururarcrararurgrnrnrsrnrnrsrnrnrsrnrn 78

Qy 64 tgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaat 123 

Db 79 rsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrn 138 

Qy 124 ccatgaggaagtgggtcgtcgagcacaagctccgagccgtaggttgcctctggctaggtg 183 

Db 139 rsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrnrsrnrn 198 

Qy 184 ggatcagcagttcgatcgcctacaactggtcgcggcccaatatgaagcctagcgtca 240 

: : : : : : : : : I : : : I : : I I : : I : : II: 
Db 199 rsrnrnrsrnrnrsrnrnrsrcrargrcrurgrcrgrurararcrurcrururgrcr 255 



RESULT 4 
US-08-576-626A-1 

Sequence 1, Application US/08576626A 
Patent No. 5998194 
GENERAL INFORMATION: 

APPLICANT : Summers, R. G. 
APPLICANT: Katz, L. 
APPLICANT : Donadio, S . 
APPLICANT: Staver, M.J. 

TITLE OF INVENTION: POLYKETIDE-ASSOCIATED SUGAR 
TITLE OF INVENTION: BIOSYNTHESIS GENES 
NUMBER OF SEQUENCES: 60 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Abbott Laboratories 
STREET: 100 Abbott Park Road 
CITY: Abbott Park 
STATE: Illinois 
COUNTRY: USA 
ZIP: 60064-3500 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/576,626A
FILING DATE: 21-DEC-1995 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 
APPLICATION NUMBER: 
FILING DATE: 
ATTORNEY/AGENT INFORMATION: 
NAME: Dianne Casuto 
REGISTRATION NUMBER: P-40,943
REFERENCE/DOCKET NUMBER: 5857.US.01
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (847) 938-3137 
TELEFAX: (847) 938-2623 
TELEX: 

INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 



LENGTH: 3756 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
US-08-576-626A-1 



Query Match 8.1%; Score 34; DB 2; Length 3756; 

Best Local Similarity 53.8%; Pred. No. 0.35;

Matches 70; Conservative 0; Mismatches 60; Indels 0; Gaps 0; 

Qy 51 gggaaagcaaagctgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccg 110 

III II Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3613 GGGTGCGCCGGCCTGGAGGAGGCCAACCAGGAGCTGGCGAACCAGCTCGCCGAGGCCGCG 3672 

Qy 111 agcaagatcgaatccatgaggaagtgggtcgtcgagcacaagctccgagccgtaggttgc 170 

II I I I I I I I I I I I I I I I I I II I I I I II 

Db 3673 GGGATCAGCGAGGGCGACGAGGTGCTCGACGTCGGGTTCGGGCTCGGCGCGCAGGACTTC 3732 

Qy 171 ctctggctag 180 

I I I I I II I 
Db 3733 TTCTGGCTCG 3742 



RESULT 5 
5198542-1/c 
;Patent No. 5198542 

APPLICANT: ONDA, HARUO; ARIMURA, AKIRA; KIMURA, CHIHARU 
; KITADA, CHIEKO

TITLE OF INVENTION: DNA ENCODING A PITUITARY ADENYLATE CYCLASE 
/ACTIVATING PROTEIN AND USE THEREOF 
NUMBER OF SEQUENCES: 16 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/540,105 
FILING DATE: 10-JUN-1990 
;SEQ ID NO:1:

LENGTH: 1763
5198542-1 



Query Match 7.9%; Score 33.2; DB 6; Length 1763; 

Best Local Similarity 50.6%; Pred. No. 0.44; 

Matches 80; Conservative 0; Mismatches 78; Indels 0; Gaps 0; 

Qy 19 aagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggaggccc 78 

I I I I I I I I I I I I I II I I I I Mill II II I I 

Db 312 AAGACGGAGGAGGCGACGAGCTGAGTGGGGGTGGGAGACAAAGTGGCCTGAAGTCCACTG 253 



Qy 79 aggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaagtggg 138

M I II I I I I II I I I II III II I II III I I I I 

Db 252 AGAAGAAAGGAGTGATAAGGAAAGAAGCAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG 193 

Qy 139 tcgtcgagcacaagctccgagccgtaggttgcctctgg 176 

III M II I I I I I I I I II I I 
Db 192 AGAGAGAGAGAGAGGGCCTCGGCGAAGTCGGCGTCTGG 155 



RESULT 6 
US-08-867-087B-54 

; Sequence 54, Application US/08867087B 

; Patent No. 5990386 

; GENERAL INFORMATION: 

; APPLICANT: An, Gynheung 

TITLE OF INVENTION: GENES CONTROLLING FLORAL DEVELOPMENT 
TITLE OF INVENTION: AND APICAL DOMINANCE IN PLANTS 
NUMBER OF SEQUENCES: 70 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Klarquist Sparkman Campbell Leigh & 

ADDRESSEE: Whinston, LLP 

STREET: One World Trade Center 

STREET: 121 S.W. Salmon Street 

STREET: Suite 1600 

CITY: Portland 

STATE: Oregon 

COUNTRY: United States of America 

ZIP : 97204 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Disk, 3-1/2 inch ' 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: MS DOS 

SOFTWARE: WordPerfect 5.1 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/867,087B

FILING DATE: June 2, 1997 

CLASSIFICATION: 800 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: U.S. 08/323,449 

FILING DATE: October 14, 1994 

APPLICATION NUMBER: U.S. 08/485,981 

FILING DATE: June 7, 1995 
ATTORNEY/AGENT INFORMATION: 

NAME: Dow, Alan. E. 

REGISTRATION NUMBER: 35,123 

REFERENCE/DOCKET NUMBER: 4630-47071 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (503) 226-7391 

TELEFAX: (503) 228-9446 
; INFORMATION FOR SEQ ID NO: 54: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1027 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: double stranded 

TOPOLOGY: linear 
US-08-867-087B-54 



Query Match 7.7%; Score 32.6; DB 2; Length 1027; 

Best Local Similarity 51.7%; Pred. No. 0.54; 

Matches 74; Conservative 0; Mismatches 69; Indels 0; Gaps 0; 

Qy 20 agaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggaggccca 79

III III III I I I I III I I I I I I I I I I I I I I I I I I 

Db 103 AGAGAGAGCTAGAGAGAGATCGATGGGGCGAGGGAAAGTAGAGCTGAAGCGGATCGAGAA 162 



Qy 80 ggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaagtgggt 139 

I I I III I I II II I I I I I I I I I I I I I I I I 

Db 163 CAAGATAAGCCGGCAGGTGACGTTCGCGAAGAGGAGGAACGGGCTGCTGAAGAAGGCGTA 222 

Qy 140 cgtcgagcacaagctccgagccg 162 

II I I I I I I I I I I 
Db 223 CGAGCTGTCCGTGCTCTGCGACG 245



RESULT 7 
US-09-007-005-32 

/ Sequence 32, Application US/09007005B 

; Patent No. 6258558 

; GENERAL INFORMATION: 

; APPLICANT: Szostak, Jack W. 

; APPLICANT: Roberts, Richard W. 

; APPLICANT: Liu, Rihe 

; TITLE OF INVENTION: SELECTION OF PROTEINS USING RNA-PROTEIN 

; TITLE OF INVENTION: FUSIONS 

; FILE REFERENCE: 00786/350003 

; CURRENT APPLICATION NUMBER: US/09/007,005B

; CURRENT FILING DATE: 1998-01-14 

; EARLIER APPLICATION NUMBER: 60/035,963 

; EARLIER FILING DATE: 1997-01-27 

; EARLIER APPLICATION NUMBER: 60/064,491 

; EARLIER FILING DATE: 1997-11-06 

; NUMBER OF SEQ ID NOS : 33 

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 32

LENGTH: 248

TYPE: RNA 
; ORGANISM: Homo sapiens 
US-09-007-005-32 



Query Match 7.6%; Score 31.8; DB 4; Length 248; 

Best Local Similarity 19.5%; Pred. No. 0.5; 



Matches 29; Conservative 61; Mismatches 59; Indels 0; Gaps 0;

Qy   4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63
       :|: : :| :  :|: : :|: :|: :|  : :|   |:|: : :|:|: :|:   : :
Db  33 rarcrararururarcrararurgrgrcrurgrarargrararcrargrarararcrurg 92

Qy  64 tgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaat 123
         :::   : :    |: :|:|:   :|:|: :::|  : :| |  : : :|:| |: ::
Db  93 rarurcrurcrurgrarargrarargrarcrcrurgrcrurgrcrgrurarararcrgru 152

Qy 124 ccatgaggaagtgggtcgtcgagcacaag 152
        |: : :|:|: : :  | |: : : :|:
Db 153 rcrgrurgrararcrargrcrurgrarar 181





RESULT 8 
US-09-244-796-32 

; Sequence 32, Application US/09244796 
; Patent No. 6281344 
; GENERAL INFORMATION: 



APPLICANT : Szostak, Jack W. 
APPLICANT: Roberts, Richard W. 
APPLICANT: Liu, Rihe 

TITLE OF INVENTION: SELECTION OF PROTEINS USING RNA-PROTEIN 

TITLE OF INVENTION: FUSIONS 

FILE REFERENCE: 00786/350007 

CURRENT APPLICATION NUMBER: US/09/244,796 

CURRENT FILING DATE: 1999-02-05 

EARLIER APPLICATION NUMBER: 60/035,963 

EARLIER FILING DATE: 1997-01-27 

EARLIER APPLICATION NUMBER: 60/064,491 

EARLIER FILING DATE: 1997-11-06 

EARLIER APPLICATION NUMBER: 09/007,005 

EARLIER FILING DATE: 1998-01-14 

NUMBER OF SEQ ID NOS : 33 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 32 
LENGTH: 248
TYPE: RNA 

ORGANISM: Homo sapiens 



Query Match 7.6%; Score 31.8; DB 4; Length 248; 

Best Local Similarity 19.5%; Pred. No. 0.5; 



Matches 29; Conservative 61; Mismatches 59; Indels 0; Gaps 0;

Qy   4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63
       :|: : :| :  :|: : :|: :|: :|  : :|   |:|: : :|:|: :|:   : :
Db  33 rarcrararururarcrararurgrgrcrurgrarargrararcrargrarararcrurg 92

Qy  64 tgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaat 123
         :::   : :    |: :|:|:   :|:|: :::|  : :| |  : : :|:| |: ::
Db  93 rarurcrurcrurgrarargrarargrarcrcrurgrcrurgrcrgrurarararcrgru 152

Qy 124 ccatgaggaagtgggtcgtcgagcacaag 152
        |: : :|:|: : :  | |: : : :|:
Db 153 rcrgrurgrararcrargrcrurgrarar 181





RESULT 9 
US-09-007-005-3 

; Sequence 3, Application US/09007005B 

; Patent No. 6258558 

; GENERAL INFORMATION: 

; APPLICANT: Szostak, Jack W. 

; APPLICANT: Roberts, Richard W. 

; APPLICANT: Liu, Rihe 

; TITLE OF INVENTION: SELECTION OF PROTEINS USING RNA-PROTEIN 

; TITLE OF INVENTION: FUSIONS 

; FILE REFERENCE: 00786/350003 

; CURRENT APPLICATION NUMBER: US/09/007,005B

; CURRENT FILING DATE: 1998-01-14

; EARLIER APPLICATION NUMBER: 60/035,963

; EARLIER FILING DATE: 1997-01-27 

; EARLIER APPLICATION NUMBER: 60/064,491 

; EARLIER FILING DATE: 1997-11-06 



; NUMBER OF SEQ ID NOS : 33 

SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 3 

LENGTH: 277 

TYPE: RNA 

ORGANISM: Artificial Sequence 
FEATURE: 

OTHER INFORMATION: Translation template 
US-09-007-005-3 



Query Match 7.6%; Score 31.8; DB 4; Length 277; 

Best Local Similarity 19.5%; Pred. No. 0.53; 



Matches 29; Conservative 61; Mismatches 59; Indels 0; Gaps 0;

Qy   4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63
       :|: : :| :  :|: : :|: :|: :|  : :|   |:|: : :|:|: :|:   : :
Db  33 rarcrararururarcrararurgrgrcrurgrarargrararcrargrarararcrurg 92

Qy  64 tgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaat 123
         :::   : :    |: :|:|:   :|:|: :::|  : :| |  : : :|:| |: ::
Db  93 rarurcrurcrurgrarargrarargrarcrcrurgrcrurgrcrgrurarararcrgru 152

Qy 124 ccatgaggaagtgggtcgtcgagcacaag 152
        |: : :|:|: : :  | |: : : :|:
Db 153 rcrgrurgrararcrargrcrurgrarar 181





RESULT 10 
US-09-244-796-3 

; Sequence 3, Application US/09244796 

; Patent No. 6281344 

; GENERAL INFORMATION:

; APPLICANT: Szostak, Jack W. 

; APPLICANT: Roberts, Richard W. 

; APPLICANT: Liu, Rihe 

; TITLE OF INVENTION: SELECTION OF PROTEINS USING RNA-PROTEIN 

; TITLE OF INVENTION: FUSIONS 

; FILE REFERENCE: 00786/350007 

; CURRENT APPLICATION NUMBER: US/09/244,796

; CURRENT FILING DATE: 1999-02-05 

; EARLIER APPLICATION NUMBER: 60/035,963 

; EARLIER FILING DATE: 1997-01-27 

; EARLIER APPLICATION NUMBER: 60/064,491 

; EARLIER FILING DATE: 1997-11-06 

; EARLIER APPLICATION NUMBER: 09/007,005 

; EARLIER FILING DATE: 1998-01-14 

; NUMBER OF SEQ ID NOS: 33

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 3

LENGTH: 277

TYPE: RNA 
; ORGANISM: Artificial Sequence 

FEATURE: 

OTHER INFORMATION: Translation template 



Query Match 7.6%; Score 31.8; DB 4; Length 277;

Best Local Similarity 19.5%; Pred. No. 0.53;

Matches 29; Conservative 61; Mismatches 59; Indels 0; Gaps 0;

Qy   4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63
       :|: : :| :  :|: : :|: :|: :|  : :|   |:|: : :|:|: :|:   : :
Db  33 rarcrararururarcrararurgrgrcrurgrarargrararcrargrarararcrurg 92

Qy  64 tgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaat 123
         :::   : :    |: :|:|:   :|:|: :::|  : :| |  : : :|:| |: ::
Db  93 rarurcrurcrurgrarargrarargrarcrcrurgrcrurgrcrgrurarararcrgru 152

Qy 124 ccatgaggaagtgggtcgtcgagcacaag 152
        |: : :|:|: : :  | |: : : :|:
Db 153 rcrgrurgrararcrargrcrurgrarar 181





RESULT 11 

PCT-US91-09422-20/c

Sequence 20, Application PC/TUS9109422 
GENERAL INFORMATION: 

APPLICANT: Mulvihill, Eileen R. 
APPLICANT: Hagen, Frederick S. 
APPLICANT: Houamed, Khaled M. 
APPLICANT: Almers, Wolfhard

TITLE OF INVENTION: G PROTEIN-COUPLED GLUTAMATE RECEPTORS 
NUMBER OF SEQUENCES: 33 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Townsend and Townsend 
STREET: One Market Plaza, Steuart Street Tower 
CITY: San Francisco 
STATE: California 
COUNTRY: USA 
ZIP: 94105-1492 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: PCT/US91/09422
FILING DATE: 19911212 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/672,007 
FILING DATE: 18-MAR-1991 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/648,481 
FILING DATE: 30-JAN-1991 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/626,806 
FILING DATE: 12-DEC-1990 
ATTORNEY/AGENT INFORMATION: 
NAME: Parmelee, Steven W. 
REGISTRATION NUMBER: 31,990 
REFERENCE/DOCKET NUMBER: 13952-6PC 



TELECOMMUNICATION INFORMATION: 
TELEPHONE: (206) 467-9600 
TELEFAX: (415) 543-5043 
; INFORMATION FOR SEQ ID NO: 20: 

SEQUENCE CHARACTERISTICS: 
LENGTH: 2426 base pairs 
TYPE: NUCLEIC ACID 
STRANDEDNESS: single 
TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

IMMEDIATE SOURCE: 
CLONE: SRI 3 
PCT-US91-09422-20 



Query Match 7.6%; Score 31.8; DB 5; Length 2426; 

Best Local Similarity 52.7%; Pred. No. 1.4; 

Matches 69; Conservative 0; Mismatches 62; Indels 0; Gaps 0; 

Qy 16 gaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggagg 75 

III I I I I I I I I I I I I I I I I I I I I M I II M I I I 
Db 2368 GAGATCGAGGGGGAACAGAAGAGAAAATGATGGCGGGGATGGGGCAGCTCAGGGAGGAGG 2309 

Qy 76 cccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaagt 135 

I I I I I I II II I I I I I I I I III II II II I I I 

Db 2308 GGCAGGGGGCAGGGCAGATGGCAGGGGAAGGCCAGGAGCAGAATGGAAAAATATGGAAGA 2249



Qy 136 gggtcgtcgag 146 

III I II 
Db 2248 GGATGATGTAG 2238
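
The `/c` suffix on the result ID and the descending Db coordinates (2368 down to 2238) suggest that the database sequence is displayed as its reverse complement. That reading is an assumption drawn from this report's layout, not from GenCore documentation. Under that assumption, a minimal Python sketch for recovering the forward-strand bases of a displayed Db row could look like this (the helper name `reverse_complement` is illustrative):

```python
# Translation table for DNA bases; both cases are handled because the
# report prints the query in lower case and the database in upper case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Db row printed for positions 2248 down to 2238 in RESULT 11.
displayed = "GGATGATGTAG"

# Assumed forward-strand bases at positions 2238 to 2248.
print(reverse_complement(displayed))  # CTACATCATCC
```

Applying the function twice returns the displayed string, which is a quick sanity check when converting coordinates between strands.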



RESULT 12 
US-08-147-777-3/C 

Sequence 3, Application US/08147777 
Patent No. 5914265 
GENERAL INFORMATION: 

APPLICANT: Roop, Dennis R. 
APPLICANT: Rothnagel, Joseph A. 
APPLICANT: Greenhalgh, David A. 
APPLICANT: Yuspa, Stuart H. 

TITLE OF INVENTION: KERATIN K1 EXPRESSION VECTORS
TITLE OF INVENTION: AND METHODS OF USE 
NUMBER OF SEQUENCES: 5 
CORRESPONDENCE ADDRESS: 
ADDRESSEE: LYON & LYON 
STREET: 611 West Sixth Street 
CITY: Los Angeles 
STATE: California 
COUNTRY: U.S.A. 
ZIP: 90017
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5" Diskette, 1.44 Mb storage 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: IBM MS-DOS (Version 5.0) 
SOFTWARE: WordPerfect (Version 5.1) 
CURRENT APPLICATION DATA: 



APPLICATION NUMBER: US/08/147,777 
FILING DATE: 
CLASSIFICATION: 800 
PRIOR APPLICATION DATA: 

PRIOR APPLICATION DATA: including application 

PRIOR APPLICATION DATA: described below: two 

APPLICATION NUMBER: 07/876,289 

FILING DATE: April 30, 1992 

APPLICATION NUMBER: Unassigned (204/144) 

FILING DATE: October 29, 1993 
ATTORNEY/AGENT INFORMATION: 

NAME: Warburg, Richard J. 

REGISTRATION NUMBER: 32,327 

REFERENCE/DOCKET NUMBER: 204/153 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (213) 489-1600 

TELEFAX: (213) 955-0440 

TELEX: 67-3510 
INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 24979 base pairs 

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 



Query Match 7.5%; Score 31.4; DB 2; Length 24979; 

Best Local Similarity 54.9%; Pred. No. 5.5; 

Matches 62; Conservative 0; Mismatches 51; Indels 0; Gaps 0;

Qy 15 ggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggag 74 

Ml II I I I I I II I I II I I I I I I I 

Db 6996 GGAGGAGGAGGAGGAGCGCGGAGGAGGAGCCAGAGGAGCAAGGAGAAGCAGAGGAGGGAG 6937 

Qy 75 gcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccat 127 

I I I M I I II I I I I I I I I I M I I I I I M I I I 

Db 6936 AAGAGGAGGAAGGCAAAGGAGAAGAAGGAGAAGAAGAAGAAGAAATAATGCAT 6884 



RESULT 13 
US-08-452-872-3/C 

Sequence 3, Application US/08452872 
Patent No. 6057298
GENERAL INFORMATION: 

APPLICANT: Roop, Dennis R. 
APPLICANT: Rothnagel, Joseph A. 
APPLICANT: Greenhalgh, David A. 
APPLICANT: Yuspa, Stuart H. 

TITLE OF INVENTION: KERATIN K1 EXPRESSION VECTORS
TITLE OF INVENTION: AND METHODS OF USE 
NUMBER OF SEQUENCES: 5 
CORRESPONDENCE ADDRESS: 
ADDRESSEE: LYON & LYON 
STREET: 611 West Sixth Street 
CITY: Los Angeles 



STATE: California

COUNTRY: U.S.A. 

ZIP: 90017 
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5" Diskette, 1.44 Mb storage 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: IBM MS-DOS (Version 5.0) 

SOFTWARE: WordPerfect (Version 5.1) 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/452,872 

FILING DATE: 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/08/147,777
FILING DATE:

APPLICATION NUMBER: 07/876,289 

FILING DATE: April 30, 1992 

APPLICATION NUMBER: Unassigned (204/144) 

FILING DATE: October 29, 1993 
ATTORNEY/AGENT INFORMATION: 
; NAME: Warburg, Richard J. 

REGISTRATION NUMBER: 32,327 

REFERENCE/DOCKET NUMBER: 204/153 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (213) 489-1600 

TELEFAX: (213) 955-0440 

TELEX: 67-3510 
; INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 24979 base pairs

TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
US-08-452-872-3 



Query Match 7.5%; Score 31.4; DB 3; Length 24979; 

Best Local Similarity 54.9%; Pred. No. 5.5; 

Matches 62; Conservative 0; Mismatches 51; Indels 0; Gaps 0; 

Qy 15 ggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggag 74 

III I I I I I I I II I II I I I I I I I I I I I I I I I I I 
Db 6996 GGAGGAGGAGGAGGAGCGCGGAGGAGGAGCCAGAGGAGCAAGGAGAAGCAGAGGAGGGAG 6937 

Qy 75 gcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccat 127 

I I M I II I I I I I I I I I I I II I I I I I I I I I I 

Db 6936 AAGAGGAGGAAGGCAAAGGAGAAGAAGGAGAAGAAGAAGAAGAAATAATGCAT 6884



RESULT 14 
PCT-US93-03985-3/C 

; Sequence 3, Application PC/TUS9303985 
; GENERAL INFORMATION: 

APPLICANT: Roop, Dennis R. 

APPLICANT: Rothnagel, Joseph A. 

APPLICANT: Greenhalgh, David A. 



APPLICANT: Yuspa, Stuart H. 

TITLE OF INVENTION: DEVELOPMENT OF A VECTOR TO TARGET GENE 
TITLE OF INVENTION: EXPRESSION TO THE EPIDERMIS OF TRANSGENIC ANIMALS 
NUMBER OF SEQUENCES: 5 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Fulbright & Jaworski 

STREET: 1301 McKinney, Suite 5100 

CITY: Houston 

STATE: Texas 

COUNTRY: U.S.A. 

ZIP: 77010-3095 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: PCT/US93/03985 

FILING DATE: 19930428 

CLASSIFICATION: 
ATTORNEY/AGENT INFORMATION:
; NAME: Paul, Thomas D. 

REGISTRATION NUMBER: 32,714 

REFERENCE/DOCKET NUMBER: D-5478 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 713/651-5325 
TELEFAX: 713/651-5246
TELEX: 762829

; INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 24979 base pairs 

TYPE: NUCLEIC ACID 

STRANDEDNESS: double 

TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
HYPOTHETICAL: NO 
ANTI-SENSE: NO 
PCT-US93-03985-3 



Query Match 7.5%; Score 31.4; DB 5; Length 24979;

Best Local Similarity 54.9%; Pred. No. 5.5; 

Matches 62; Conservative 0; Mismatches 51; Indels 0; Gaps 0; 

Qy 15 ggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggag 74

III II I I I I I II' I II I I I I I I I I I I I I I I I I I 
Db 6996 GGAGGAGGAGGAGGAGCGCGGAGGAGGAGCCAGAGGAGCAAGGAGAAGCAGAGGAGGGAG 6937 

Qy 75 gcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccat 127

I I I I I I I I I I I I I I I I I I II I I I I III ill 

Db 6936 AAGAGGAGGAAGGCAAAGGAGAAGAAGGAGAAGAAGAAGAAGAAATAATGCAT 6884 



RESULT 15 

5196516-7/c 

; Patent No. 5196516 

APPLICANT: SCHREURS, CHRISTA S.; METTENLEITER, THOMAS C.
SIMON, ARTUR J.; LUKAS, NOEMI; RZIHA, HANNS J.

TITLE OF INVENTION: PSEUDORABIES VIRUS VACCINE 
NUMBER OF SEQUENCES: 8 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/383,833 
FILING DATE: 21-JUL-1989 
SEQ ID NO: 7: 

LENGTH: 4897 
5196516-7 



Query Match 7.3%; Score 30.8; DB 6; Length 4897;

Best Local Similarity 50.7%; Pred. No. 4.1; 

Matches 74; Conservative 0; Mismatches 72; Indels 0; Gaps 0;



Qy 14 cggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcgga 73 

I I II II I -I I I I I I I I I I I I I I I I I I I II I I I 

Db 4379 CGGAGGGGAGAGGGACGGAGGGGAGAGGGACGGGGGGGGGAGAGGGACGGAGGGGAGAGG 4320

Qy 74 ggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaa 133 

II I I I I I I I I II II I I I I I I I I II I I I I 

Db 4319 GACGGAGGGGAGAGGGACGGGGGGGGGAGAGGGACGGAGGGGAGAGGGACGGGGGGGGGA 4260

Qy 134 gtgggtcgtcgagcacaagctccgag 159 

I I I I I I I I I I I I I I I 
Db 4259 GAGGGACGGAGGGGAGAGGGACGGAG 4234



Search completed: February 7, 2002, 10:51:34 
Job time: 6060 sec 

GenCore version 4.5 

Copyright (c) 1993 - 2000 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:          February 7, 2002, 03:18:43 ; Search time 4942.22 Seconds
                                              (without alignments)
                                              915.373 Million cell updates/sec

Title:           US-09-394-745-5893
Perfect score:   421
Sequence:        1 gaaaaaaataactcggaaaa ccatgttggttcctgcatgc 421

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        11351937 seqs, 5372889281 residues

Total number of hits satisfying chosen parameters:   22703874

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :       EST:*
                 1:  em_estfun:*
                 2:  em_esthum:*
                 3:  em_estin:*
                 4:  em_estom:*
                 5:  em_estpl:*
                 6:  em_estba:*
                 7:  em_estro:*
                 8:  em_estov:*
                 9:  em_htc:*
                 10:  gb_est1:*
                 11:  gb_est2:*
                 12:  gb_htc:*
                 13:  gb_gss:*
                 14:  em_gss_fun:*
                 15:  em_gss_hum:*
                 16:  em_gss_inv:*
                 17:  em_gss_pln:*
                 18:  em_gss_pro:*
                 19:  em_gss_rod:*
                 20:  em_gss_vrt:*
                 21:  em_gss_other:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

                              SUMMARIES

                    %
 Result           Query
    No.    Score  Match  Length  DB  ID         Description

      1    358.2   85.1     456  11  BF317968   BF317968 OV1_10_B0
      2    297.6   70.7     508  10  AW424866   AW424866 660039G12
      3    282.8   67.2     561  10  AW453226   AW453226 660033E05
      4    248.6   59.0     396  10  BE429089   BE429089 MTD014.C0
      5    237.6   56.4     649  10  AU162606   AU162606 AU162606
      6    228.8   54.3     534  10  AU056970   AU056970 AU056970
      7    219.8   52.2     517  10  AU070966   AU070966 AU070966
      8    217.2   51.6     574  10  AW352644   AW352644 660033E05
      9    212.2   50.4     464  11  D49054     D49054 RICS15677A
     10    211     50.1     431  11  D22742     D22742 RICC1155A R
     11    193.4   45.9     422  11  C97246     C97246 C97246 Rice
     12    167.6   39.8     231  11  BF317828   BF317828 OV1_10_B0
     13    165.4   39.3     615  10  AI967165   AI967165 614006G06
     14    165.2   39.2     242  11  BF203045   BF203045 WHE1768 G
     15    163     38.7     352  11  C97312     C97312 C97312 Rice
     16    162     38.5     398  11  C27791     C27791 C27791 Rice
     17    156     37.1     379  11  D22974     D22974 RICC1913A R
     18    147.6   35.1     459  11  C28034     C28034 C28034 Rice
     19    147.4   35.0     473  11  BG157171   BG157171 sab23d06.
     20    145.4   34.5     502  11  BF596766   BF596766 su62d07.y
     21    145.2   34.5     334  10  AI988618   AI988618 sd05d06.y
     22    145.2   34.5     440  11  BG882885   BG882885 sae78h07.
     23    145.2   34.5     458  10  AW734667   AW734667 sk97h06.y
     24    145.2   34.5     464  11  BE804609   BE804609 sr84b03.y
 c   25    145.2   34.5     680  10  AW348125   AW348125 GM210001A
 c   26    142.4   33.8     544  10  AI711660   AI711660 605058H02
     27    142.4   33.8     559  10  AW746168   AW746168 WS1 39 B0
 c   28    142.4   33.8     569  11  BG266936   BG266936 1000109A0
     29    142.4   33.8     590  10  BE360454   BE360454 DG1 63 G1
 c   30    142.4   33.8     600  10  AI673899   AI673899 605038G11
     31    142.4   33.8     621  10  BE360398   BE360398 DG1 63 G1
     32    142.4   33.8     636  10  AW565110   AW565110 LG1 323 C
     33    139.8   33.2     479  10  BE445786   BE445786 WHE1453 G
 c   34    139.8   33.2     505  11  BG312631   BG312631 WHE2456 A
     35    138.8   33.0     473  11  BF473251   BF473251 WHE0926 C
     36    136.6   32.4     861  10  BE558473   BE558473 HV_CEb001
     37    136.6   32.4     941  10  BE454416   BE454416 HVSMEh009
     38    135.8   32.3     448  11  T18288     T18288 5c06bl0-t7
 c   39    135.4   32.2     671  11  BG874201   BG874201 MEST47-E0
     40    135     32.1     276  11  D43130     D43130 D43130 Rice
     41    135     32.1     503  11  BF626192   BF626192 HVSMEa001
     42    135     32.1     521  10  AL505327   AL505327 AL505327
     43    135     32.1     557  10  AL504875   AL504875 AL504875
     44    134.4   31.9     887  11  BG365949   BG365949 HVSMEi000
     45    133.4   31.7     621  11  BF473588   BF473588 WHE0930 B



ALIGNMENTS 



RESULT 1
BF317968
LOCUS       BF317968     456 bp    mRNA            EST       21-NOV-2000
DEFINITION  OV1_10_B03.b1_A002 Ovary 1 (OV1) Sorghum bicolor cDNA, mRNA
            sequence.
ACCESSION   BF317968
VERSION     BF317968.1  GI:11266505
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 456)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: ovaries of varying immature stages
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: JEN REV
            High quality sequence stop: 423
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..456
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Ovary 1 (OV1)"
                     /note="Organ: Mix of ovaries of varying immature stages
                     from 8-week-old plants; Vector: pBluescript II from Lambda
                     Zap II; Site_1: XhoI; Site_2: EcoRI; The library was made
                     from poly-A RNA in the cloning vector lambda ZAP II.
                     Clones to be sequenced were prepared by mass excision."
BASE COUNT      138 a    103 c    125 g     89 t      1 others
ORIGIN



Query Match 85.1%; Score 358.2; DB 11; Length 456; 

Best Local Similarity 94.0%; Pred. No. 2.5e-80; 

Matches 394; Conservative 0; Mismatches 23; Indels 2; Gaps 2; 

Qy 4 aaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagc 63 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I III I I I I I I I I I I I I I I I 
Db 2 AAAAATAACTCCGAAAAGAAAGAGACGCCGAAAATTCGAAGGGGAAGGGGAAAGCAAAGC 61 

Qy 64 tgatggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaat 123

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I I I 
Db 62 AAATGGCGGAGGCCGAGGGAAAAGCAAAGCAAATGGCGGAGGGCCCGAGCAAGATCGAAT 121 

Qy 124 ccatg-aggaagtgggtcgtcgagcacaagctccgagccgtaggttgcctctggctaggt 182 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 122 CCATGAACTAAGTGGGTCGTCGACCACAAGCTCCGAGCCGTAGGTTGCCTCTGGCTAGGT 181 

Qy 183 gggatcagcagttcgatcgcctacaactggtcgcggcccaatatgaagcctagcgtcaag 242

I II I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I 
Db 182 GGGATCAGCAGTTCGATCGCCTACAACTGGTCGCGGCCCAATATGAAGACTAGCGTCAAG 241 

Qy 243 atcatccacgcaaggttgcatgctcaagctctaaccctggctgcattagttggttctgca 302 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 242 ATCATCCACGCAAGGTTGCATGCACAAGGTCTAACCCTAGCTGCATTAGTTGGTTCTGCA 301 

Qy 303 tgcgtggagtactatgaccagaagtatggttcttctgggccaaaggtggacaaatacaca 362

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I 
Db 302 TGCGTGGAGTACTATGACAATAAGTATGGTTCTTCTGGGCCAAAGGTGGACAAATACACA 361 

Qy 363 agccaatacctggcccattcccataaagattaaaggtcgccatgttggttcctgcatgc 421 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III II I I I I I I I I I I I I I I I II I I 
Db 362 AGCCAATACCTGGCCCATGCGCATAAAGATT-AAGATCTCCATGTTGGTTCCTGCATAC 419 
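
The percentages printed with each result appear to follow directly from the other reported figures. The formulas below are inferred from the numbers in this report rather than taken from GenCore documentation, and the function names are illustrative: Query Match looks like Score divided by the Perfect score (421 for this query), and Best Local Similarity looks like Matches divided by the total number of alignment columns. A minimal Python sketch under those assumptions:

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Figures copied from RESULT 1 (BF317968); 421 is the Perfect score
# given in the search header.
print(query_match_pct(358.2, 421))               # 85.1
print(best_local_similarity_pct(394, 0, 23, 2))  # 94.0
```

The same arithmetic reproduces the figures for RESULT 11 in the patent search (69 matches over 131 columns gives 52.7%), which is some evidence that the inferred definitions hold across both databases.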



RESULT 2
AW424866/C
LOCUS       AW424866     508 bp    mRNA            EST       09-FEB-2000
DEFINITION  660039G12.x1 660 - Mixed stages of anther and pollen Zea mays cDNA,
            mRNA sequence.
ACCESSION   AW424866
VERSION     AW424866.1  GI:6952798
KEYWORDS    EST.
SOURCE      Zea mays.
  ORGANISM  Zea mays
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Zea.
REFERENCE   1  (bases 1 to 508)
  AUTHORS   Walbot,V.
  TITLE     Maize ESTs from various cDNA libraries sequenced at Stanford
            University
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Walbot V
            Department of Biological Sciences
            Stanford University
            855 California Ave, Palo Alto, CA 94304, USA
            Tel: 650 723 2227
            Fax: 650 725 8221
            Email: walbot@stanford.edu
            Plate: 660039 row: G column: 12.
FEATURES             Location/Qualifiers
     source          1..508
                     /organism="Zea mays"
                     /cultivar="Ohio43"
                     /db_xref="taxon:4577"
                     /clone_lib="660 - Mixed stages of anther and pollen"
                     /tissue_type="whole premieotic anthers to pollen shed"
                     /dev_stage="premieotic anthers to pollen shed"
                     /lab_host="XLOLR"
                     /note="Organ: anthers; Vector: Lambda Zap; Site_1: EcoRI;
                     Site_2: XhoI; Anther and pollen cDNA library.
                     Directionally sequenced with 5' end at the EcoRI site.
                     Created by Amie Franklin."
BASE COUNT      118 a    140 c    107 g    141 t      2 others
ORIGIN

Query Match 70.7%; Score 297.6; DB 10; Length 508;
Best Local Similarity 88.8%; Pred. No. 4.8e-65;
Matches 332; Conservative 0; Mismatches 41; Indels 1; Gaps 1;



Qy 42 aaaggggaggggaaagcaaagctgatggcggaggcccaggggaaagcaaagcaaatggcg 101

I I I I I I I II I I II I I I I I I I I I I I I 1 I I I I I I I I 

Db 506 AGAGGAGACGCCGAAAATTCGAAGGGGGAAGAGGAGAAGAGGGGAAAGAAGCAAATGGCG 447

Qy 102 gaggccccgagcaagatcgaatccatgaggaagtgggtcgtcgagcacaagctccgagcc 161 

III I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 446 GAGTCCCCGAGCAAGATCGAATCCATGAGGAAGTGGGTCGTCGAGCCCAAGCTCCGAGCC 387

Qy 162 gtaggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggccc 221 

I I I I I I I I I I I I I I I I I II I I I I I I ! I I I I I I I II I I I I I I I II I II I II I I I I I I I 
Db 386 GTAGGTTGCCTCTGGCTAGGTGGGATCAGCAGTTCGATCGCATNCANCTGGTCGCGGCCC 327 



Qy 222 aatatgaagcctagcgtcaagatcatccacgcaaggttgcatgctcaagctctaaccctg 281 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 326 AATATGAAGACTAGCGTCAAGATCATCCACGCAAGGTTGCATGCGCAGGCTCTAACCCTA 267



Qy 282 gctgcattagttggttctgcatgcgtggagtactatgaccagaagtatggttcttctggg 341
I I I I I I I II I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 



Db 266 GCTGCATTAGTTGGTTCTGCATGCGTGGAGTACTACGACCAGAAGTATGGTTCTTCTGGG 207

Qy 342 ccaaaggtggacaaatacacaagccaatacctggcccattcccataaagattaaaggtcg 401 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I ! I I I I I I I I I I I I I I I I I 
Db 206 CCAAAGGTGGACAAGTACACAAGCCAATATCTGGCCCATTCGCATAAAGATT-AAGGTCC 148

Qy 402 ccatgttggttcct 415 

I I I I I I I I II I I I 
Db 147 CCATGTTGGTTCAT 134



RESULT 3 
AW453226/C 

LOCUS       AW453226     561 bp    mRNA            EST       17-FEB-2000
DEFINITION  660033E05.y1 660 - Mixed stages of anther and pollen Zea mays cDNA,
            mRNA sequence.
ACCESSION   AW453226
VERSION     AW453226.1  GI:6994012
KEYWORDS    EST.
SOURCE      Zea mays.
  ORGANISM  Zea mays
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Zea.
REFERENCE   1  (bases 1 to 561)
  AUTHORS   Walbot,V.
  TITLE     Maize ESTs from various cDNA libraries sequenced at Stanford
            University
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Walbot V
            Department of Biological Sciences
            Stanford University
            855 California Ave, Palo Alto, CA 94304, USA
            Tel: 650 723 2227
            Fax: 650 725 8221
            Email: walbot@stanford.edu
            Plate: 660033 row: E column: 05.
FEATURES             Location/Qualifiers
     source          1..561
                     /organism="Zea mays"
                     /cultivar="Ohio43"
                     /db_xref="taxon:4577"
                     /clone_lib="660 - Mixed stages of anther and pollen"
                     /tissue_type="whole premieotic anthers to pollen shed"
                     /dev_stage="premieotic anthers to pollen shed"
                     /lab_host="XLOLR"
                     /note="Organ: anthers; Vector: Lambda Zap; Site_1: EcoRI;
                     Site_2: XhoI; Anther and pollen cDNA library.
                     Directionally sequenced with 5' end at the EcoRI site.
                     Created by Amie Franklin."
BASE COUNT      160 a    132 c    115 g    154 t
ORIGIN



Query Match 67.2%; Score 282.8; DB 10; Length 561; 

Best Local Similarity 95.9%; Pred. No. 2.6e-61; 

Matches 301; Conservative 0; Mismatches 12; Indels 1; Gaps 1; 



Qy 102 gaggccccgagcaagatcgaatccatgaggaagtgggtcgtcgagcacaagctccgagcc 161 

III I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I 
Db 561 GAGTCCCCGAGCAAGATCGAATCCATGAGGAAGTGGGTCGTCGAGCACAAGCTCCGAGCC 502 

Qy 162 gtaggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggccc 221 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I 
Db 501 GTAGGTTGCCTCTGGCTAGGTGGGATCAGCAGTTCGATCGCATACAACTGGTCGCGGCCC 442



Qy 222 aatatgaagcctagcgtcaagatcatccacgcaaggttgcatgctcaagctctaaccctg 281 

I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I II I I I I I I I I II I I I I I I I I I II 
Db 441 AATATGAAGACTAGCGTCAAGATCATCCACGCAAGGTTGCATGCGCAGGCTCTAACCCTA 382 



Qy 282 gctgcattagttggttctgcatgcgtggagtactatgaccagaagtatggttcttctggg 341 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 381 GCTGCATTAGTTGGTTCTGCATGCGTGGAGTACTACGACCAGAAGTATGGTTCTTCTGGG 322 



Qy 342 ccaaaggtggacaaatacacaagccaatacctggcccattcccataaagattaaaggtcg 401

I I I I I I I I II I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 321 CCAAAGGTGGACAAGTACACAAGCCAATATCTGGCCCATTCGCATAAAGATT-AAGGTCC 263



Qy 402 ccatgttggttcct 415 

I I I I II I I I I I I I 
Db 262 CCATGTTGGTTCAT 249



RESULT 4
BE429089
LOCUS       BE429089     396 bp    mRNA            EST       26-JUL-2000
DEFINITION  MTD014.C06F990624 ITEC MTD Durum Wheat Root Library Triticum
            turgidum subsp. durum cDNA clone MTD014.C06, mRNA sequence.
ACCESSION   BE429089
VERSION     BE429089.1  GI:9426932
KEYWORDS    EST.
SOURCE      durum wheat.
  ORGANISM  Triticum turgidum subsp. durum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Pooideae; Triticeae; Triticum.
REFERENCE   1  (bases 1 to 396)
  AUTHORS   Anderson,O.A., Appels,R., Bailey,P., Blake,T., Close,T.,
            Cloutier,S., Dubcovsky,J., Feuillet,C., Gale,M., Graner,A.,
            Gustafson,P., Herrmann,R.G., Holton,T., Jacquemin,J.M., Jia,J.,
            Joudrier,P., Langridge,P., Lazo,G.R., Lin,J.J., McGuire,P.,
            Ogihara,Y., Pecchioni,N., Qualset,C., Schuch,W., Selvaraj,G.,
            Shariflou,M., Sorrells,M., Warburton,M. and Wenzel,G.
  TITLE     International Triticeae EST Cooperative (ITEC): Production of
            Expressed Sequence Tags for Species of the Triticeae
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Joudrier P
            INRA, Unite de Biochimie et Biologie Moleculaire des Cereales
            2, place VIALA, 34060 Montpellier cedex 01 FRANCE
            Tel: 33 4 99 61 23 84
            Fax: 33 4 99 61 23 48
            Email: joudrier@ensam.inra.fr
            International Triticeae EST Cooperative (ITEC)
            http://wheat.pw.usda.gov/genome.
FEATURES             Location/Qualifiers
     source          1..396
                     /organism="Triticum turgidum subsp. durum"
                     /cultivar="Siliana"
                     /db_xref="taxon:4567"
                     /clone="MTD014.C06"
                     /clone_lib="ITEC MTD Durum Wheat Root Library"
                     /tissue_type="root"
                     /dev_stage="3-day-old seedling, water-stressed"
                     /note="Vector: pSPORT1; T7 primers used. See pSPORT1
                     polylinker site. 0.3-2.0 Kbp average insert size."
BASE COUNT      113 a     98 c    111 g     74 t
ORIGIN



Query Match 59.0%; Score 248.6; DB 10; Length 396; 

Best Local Similarity 81.8%; Pred. No. 1.1e-52;

Matches 287; Conservative 0; Mismatches 64; Indels 0; Gaps 0; 

Qy 69 gcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatg 128 

II I II I I II I I I I I I I I I I I I I I I II I I I III I I I I I I I I I I I I I I 

Db 46 GCAGCGGAAACAGAGGCGGCGAAGCAAATGGCGGAGGCCCCAAGCCAGATCGAATCCATG 105

Qy 129 aggaagtgggtcgtcgagcacaagctccgagccgtaggttgcctctggctaggtgggatc 188 

! I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I II I I M I I I I I I I I I I I I 
Db 106 CGGAAGTGGGTGGTCGATCACAAGCTCCGAGCCGTAGGTTGCCTGTGGCTTAGCGGGATC 165 

Qy 189 agcagttcgatcgcctacaactggtcgcggcccaatatgaagcctagcgtcaagatcatc 248

III II I I I I I I I I I I I I I I I I II I I ! I I I I I I I I I I I I I I II I I I I I I I I I 
Db 166 TCCAGCTCCATCGCGTACAACTGGTCGCGGCCCAACATGAAGACCAGCGTCAAGCTCATC 225 

Qy 249 cacgcaaggttgcatgctcaagctctaaccctggctgcattagttggttctgcatgcgtg 308

I I I I I I I I II I I I I I I I 11111111111 I I I II I I I I I I III I I I I I II 
Db 226 CACGCAAGGTTGCATGCGCAAGCTCTAACGATCGCTGCCTTAGGTAGTTGTGCATTAGTA 285 

Qy 309 gagtactatgaccagaagtatggttcttctgggccaaaggtggacaaatacacaagccaa 368 

I I I I I M I I I I I I I I I II I I I I I I II II I I I I I M M I I I I I I I I Mill II 
Db 286 GAGTACTATGAACAGAACTACGGTTCTTCAGGACCAAAGGTGGACAAATATACAAGGCAT 345

Qy 369 tacctggcccattcccataaagattaaaggtcgccatgttggttcctgcat 419 

I I I I I I I I I I I I I I M I i I I I I II MM I I I III II 
Db 346 TACATGTCACACTCGCATAAAGATTAATTATCTGCATGGTCGGTGCTGGAT 396



RESULT 5
AU162606
LOCUS       AU162606     649 bp    mRNA            EST       26-OCT-2000
DEFINITION  AU162606 Rice cDNA from young root Oryza sativa cDNA clone R10541,
            mRNA sequence.
ACCESSION   AU162606
VERSION     AU162606.1  GI:11026005
KEYWORDS    EST.
SOURCE      Oryza sativa.
  ORGANISM  Oryza sativa
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Ehrhartoideae; Oryzeae; Oryza.
REFERENCE   1  (bases 1 to 649)
  AUTHORS   Sasaki,T. and Yamamoto,K.
  TITLE     Rice cDNA from young root (2000)
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Takuji Sasaki
            National Institute of Agrobiological Resources
            Rice Genome Research Program, Kannondai 2-1-2, Tsukuba, Ibaraki
            305-8602, Japan
            Tel: 81-298-38-7441
            Fax: 81-298-38-7468
            Email: tsasaki@abr.affrc.go.jp, URL:http://rgp.dna.affrc.go.jp/
            PROJECT='RGP'.
            R10541_2Z.
FEATURES             Location/Qualifiers
     source          1..649
                     /organism="Oryza sativa"
                     /strain="Nipponbare"
                     /db_xref="taxon:4530"
                     /clone="R10541"
                     /clone_lib="Rice cDNA from young root"
                     /tissue_type="young root"
BASE COUNT      213 a    124 c    159 g    150 t      3 others
ORIGIN



Query Match 56.4%; Score 237.6; DB 10; Length 649;

Best Local Similarity 81.1%; Pred. No. 6.8e-50; 

Matches 287; Conservative 0; Mismatches 66; Indels 1; Gaps 1; 

Qy 42 aaaggggaggggaaagcaaagctgatggcggaggcccaggggaaagcaaagcaaatggcg 101 

I I I I I I I I I I I III I II I I I I I I I I I I I I I I I I I 

Db 18 AAAGAGAAGCGAGAGAAATCGGAGAGGAAGATGGGGGAAGAGGCGGCGAAGCAAATGGCG 77

Qy 102 gaggccccgagcaagatcgaatccatgaggaagtgggtcgtcgagcacaagctccgagcc 161 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I MM III 

Db 78 GAANCCCCGGGCAAGATTGAATCCATGAGGAAGTGGGTCATCGACCACAA-NTCCGCGCC 136

Qy 162 gtaggttgcctctggctaggtgggatcagcagttcgatcgcctacaactggtcgcggccc 221 

II I II I I I II I I I I II I I M I I I II I II I I I I I II I I II I I II I II I I I II I 
Db 137 GTAGGTTGCCTATGGCTTACTGGGATCAGCAGCTCGATTGCGTACAACTGGTCGAGGCCC 196 

Qy 222 aatatgaagcctagcgtcaagatcatccacgcaaggttgcatgctcaagctctaaccctg 281 

II I M II II I I II I II I II I II I I II II I I I II I I II I M I II I II I I I I II I II 
Db 197 AATATGAAGACTAGCGTCAAGATCATCCATGCAAGGTTGCATGCTCAAGCCCTAACACTA 256 

Qy 282 gctgcattagttggttctgcatgcgtggagtactatgaccagaagtatggttcttctggg 341 

II II II I II M I II I I I I I I I I II II I II I I I I I I II II I I II I I 
Db 257 GCAGCGTTAGTGGGATCTGCAATGGTAGAGTACTATGACGCGAAGTACGGCACATCTGGA 316 

Qy 342 ccaaaggtggacaaatacacaagccaatacctggcccattcccataaagattaa 395

II M II I M I I I I II II I II I I I II II I II M I I I I II I I II I I I I I I I I 
Db 317 CCGAAGGTGGACAAGTACACAAGCCAATACCTGGCGCATTCACATAAAGATTAA 370 



RESULT 6
AU056970
LOCUS       AU056970     534 bp    mRNA            EST       29-APR-1999
DEFINITION  AU056970 Oryza sativa mature leaf Nipponbare Oryza sativa cDNA
            clone S21027_1A, mRNA sequence.
ACCESSION   AU056970
VERSION     AU056970.1  GI:4715854
KEYWORDS    EST.
SOURCE      Oryza sativa.
  ORGANISM  Oryza sativa
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Ehrhartoideae; Oryzeae; Oryza.
REFERENCE   1  (bases 1 to 534)
  AUTHORS   Yamamoto,K. and Sasaki,T.
  TITLE     Rice cDNA from mature leaf
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Takuji Sasaki
            National Institute of Agrobiological Resources
            Rice Genome Research Program, Kannondai 2-1-2, Tsukuba, Ibaraki
            305-8602, Japan
            Tel: 81-298-38-7441
            Fax: 81-298-38-7468
            Email: tsasaki@abr.affrc.go.jp, URL:http://rgp.dna.affrc.go.jp/
            PROJECT='RGP'.
FEATURES             Location/Qualifiers
     source          1..534
                     /organism="Oryza sativa"
                     /strain="Nipponbare"
                     /db_xref="taxon:4530"
                     /clone="S21027_1A"
                     /clone_lib="Oryza sativa mature leaf Nipponbare"
                     /tissue_type="mature leaf"
BASE COUNT      167 a    109 c    149 g    109 t
ORIGIN



Query Match 54.3%; Score 228.8; DB 10; Length 534; 

Best Local Similarity 74.5%; Pred. No. 1.1e-47;

Matches 301; Conservative 0; Mismatches 102; Indels 1; Gaps 1;

Qy 17 aaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggaggc 76

III I I I I I I III I I I I I I I I I I I I I I II III 

Db 24 AAAGAAAGAAGAAAAACAAAGCGCGTAAAGAGAAGCGAGAGAAATCGGAGAGGAAGATGG 83

Qy 77 ccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaagtg 136 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 84 GGGAAGAGGCGGCGAACAAATGGCGGAAGCCCCGGGCAAGATTGAATCCATGAGGAAGTG 143

Qy 137 ggtcgtcgagcacaagctccgagccgtaggttgcctctggctaggtgggatcagcagttc 196 

I I I I I I II I I I I I I I I I II I I I I I I I I I M I I I I I I I I I I I I I I M I I I II 
Db 144 GGTCATCGACCACAAGCTCCGCGCCGTA-GTTGCCTATGGCTTACTGGGATCAGCAGCTC 202

Qy 197 gatcgcctacaactggtcgcggcccaatatgaagcctagcgtcaagatcatccacgcaag 256 

III II I I I I I I I II I I I I I I M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 203 GATTGCGTACAACTGGTCGAGGCCCAATATGAAGACTAGCGTCAAGATCATCCATGCAAG 262 

Qy 257 gttgcatgctcaagctctaaccctggctgcattagttggttctgcatgcgtggagtacta 316 

I I I I I I I I I I I I I I I I I I I I II II II I II I I II I I I I I I II I I I I I I I I 
Db 263 GTTGCATGCTCAAGCCCTAACACTAGCAGCGTTAGTGGGATCTGCAATGGTAGAGTACTA 322



Qy 317 tgaccagaagtatggttcttctgggccaaaggtggacaaatacacaagccaatacctggc 376

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 323 TGACGCGAAGTACGGCACATCTGGACCGAAGTGGGACAAGTACACAAGCCAATACCTGGG 382 

Qy 377 ccattcccataaagattaaaggtcgccatgttggttcctgcatg 420

I II II I I I I I I I I I I I I I I I I I I 
Db 383 CGCATTCACATAAAGATTAAGATCTTCATATTCGCTGTTGGATG 426



RESULT 7 

AU070966 

LOCUS AU070966 517 bp mRNA EST 10-JUN-1999
DEFINITION AU070966 Rice cDNA from young root Oryza sativa cDNA clone
R10541_1A, mRNA sequence.
ACCESSION AU070966
VERSION AU070966.1 GI:5038856
KEYWORDS EST.
SOURCE Oryza sativa.
ORGANISM Oryza sativa
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
Ehrhartoideae; Oryzeae; Oryza.
REFERENCE 1 (bases 1 to 517)
AUTHORS Yamamoto,K. and Sasaki,T.
TITLE Rice cDNA from young root
JOURNAL Unpublished (1999)
COMMENT Contact: Takuji Sasaki
National Institute of Agrobiological Resources
Rice Genome Research Program, Kannondai 2-1-2, Tsukuba, Ibaraki
305-8602, Japan
Tel: 81-298-38-7441
Fax: 81-298-38-7468
Email: tsasaki@abr.affrc.go.jp, URL: http://rgp.dna.affrc.go.jp/
PROJECT='RGP'.
FEATURES Location/Qualifiers
source 1..517
/organism="Oryza sativa"
/strain="Nipponbare"
/db_xref="taxon:4530"
/clone="R10541_1A"
/clone_lib="Rice cDNA from young root"
/tissue_type="young root"
BASE COUNT 165 a 103 c 142 g 102 t 5 others
ORIGIN



Query Match 52.2%; Score 219.8; DB 10; Length 517; 

Best Local Similarity 76.8%; Pred. No. 2e-45; 

Matches 291; Conservative 0; Mismatches 86; Indels 2; Gaps 

Qy 17 aaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggaggc 76

II I I I I I I I III I I I I I I I I I I I I I I II III 

Db 28 AAAGAAAGAAGAAAAACAAAGCGCGTAAAGAGAAGCGAGAGAAATCGGAGAGGAAGATGG 87 

Qy 77 ccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaagtg 136 

I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I II I I I I I I II 



Db 88 GGGAAGAGGCGGCGAACAAATNGCGGAAG-CCCGGGCAAGATTGAATCCATGAGGAAGTG 146



Qy 137 ggtcgtcgagcacaagctccgagccgtaggttgcctctggctaggtgggatcagcagttc 196 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I II 
Db 147 GGTCATCGACCACAAGCTCCGCGCCGTA-GTTGCCTATGGCTTACTGGGATCAGCAGCTC 205 

Qy 197 gatcgcctacaactggtcgcggcccaatatgaagcctagcgtcaagatcatccacgcaag 256 

III II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 206 GATTGCGTACAACTGGTCGAGGCCCAATATGAAGACNAGCGTCAAGATCATCCATGCAAG 265

Qy 257 gttgcatgctcaagctctaaccctggctgcattagttggttctgcatgcgtggagtacta 316 

I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I 
Db 266 GTTGCATGCTCAAGCCCTAACACNAGCAGCGTTAGTGGGATCNGCAATGGTAGAGTACTA 325

Qy 317 tgaccagaagtatggttcttctgggccaaaggtggacaaatacacaagccaatacctggc 376 

I I I I I I I I I I II I I I I I I II III .111111 I I I I I I I I I I I I I I I I I I I I 
Db 326 TGACGCGAAGTACGGCACATCTGGACCGAAGTGGGACAAGTACACAAGCCAATACCTGGC 385 

Qy 377 ccattcccataaagattaa 395 

I I I I I I M I I I I M I I I 
Db 386 GCATTCACATAAAGATTAA 404



RESULT 8 

AW352644 

LOCUS AW352644 574 bp mRNA EST 02-FEB-2000
DEFINITION 660033E05.x1 660 - Mixed stages of anther and pollen Zea mays cDNA,
mRNA sequence.
ACCESSION AW352644
VERSION AW352644.1 GI:6851634
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 574)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 660033 row: E column: 05.
FEATURES Location/Qualifiers
source 1..574
/organism="Zea mays"
/cultivar="Ohio43"
/db_xref="taxon:4577"
/clone_lib="660 - Mixed stages of anther and pollen"
/tissue_type="whole premieotic anthers to pollen shed"
/dev_stage="premieotic anthers to pollen shed"
/lab_host="XLOLR"
/note="Organ: anthers; Vector: Lambda Zap; Site_1: EcoRI;
Site_2: XhoI; Anther and pollen cDNA library.
Directionally sequenced with 5' end at the EcoRI site.
Created by Amie Franklin."
BASE COUNT 173 a 132 c 151 g 118 t
ORIGIN



Query Match 51.6%; Score 217.2; DB 10; Length 574;

Best Local Similarity 92.7%; Pred. No. 9.3e-45; 

Matches 228; Conservative 0; Mismatches 18; Indels 0; Gaps 0; 

Qy 68 ggcggaggcccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccat 127 

II I I I I II II I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I 
Db 329 GGAAGAGGAGAAGAGGGGAAAGAAGCAAATGGCGGAGTCCCCGAGCAAGATCGAATCCAT 388

Qy 128 gaggaagtgggtcgtcgagcacaagctccgagccgtaggttgcctctggctaggtgggat 187 

I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 389 GAGGAAGTGGGTCGTCGAGCACAAGCTCCGAGCCGTAGGTTGCCTCTGGCTAGGTGGGAT 448

Qy 188 cagcagttcgatcgcctacaactggtcgcggcccaatatgaagcctagcgtcaagatcat 247

I II I I I I I I I I I I I I M I II I I I I I 1 M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 449 CAGCAGTTCGATCGCATACAACTGGTCGCGGCCCAATATGAAGACTAGCGTCAAGATCAT 508

Qy 248 ccacgcaaggttgcatgctcaagctctaaccctggctgcattagttggttctgcatgcgt 307

I I I I II I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 509 CCACGCAGGGTTGCATGCGCAGGCTCTAACCCTAGCTGCATTAGTTGGTTCTGCATGCGT 568

Qy 308 ggagta 313 

I I I I I I 
Db 569 GGAGTA 574



RESULT 9 

D49054 

LOCUS D49054 464 bp mRNA EST 02-AUG-1995
DEFINITION RICS15677A Rice green shoot Oryza sativa cDNA, mRNA sequence.
ACCESSION D49054
VERSION D49054.1 GI:702763
KEYWORDS EST.
SOURCE Oryza sativa.
ORGANISM Oryza sativa
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
Ehrhartoideae; Oryzeae; Oryza.
REFERENCE 1 (bases 1 to 464)
AUTHORS Sasaki,T., Miyao,A. and Yamamoto,K.
TITLE Rice cDNA from callus 1995
JOURNAL Unpublished (1995)
COMMENT Contact: Takuji Sasaki
National Institute of Agrobiological Resources
Rice Genome Research Program, Kannondai 2-1-2, Tsukuba, Ibaraki
305-8602, Japan
Tel: 81-298-38-7441
Fax: 81-298-38-7468
Email: tsasaki@abr.affrc.go.jp, URL: http://rgp.dna.affrc.go.jp/.
FEATURES Location/Qualifiers
source 1..464
/organism="Oryza sativa"
/strain="Nipponbare"
/db_xref="taxon:4530"
/clone_lib="Rice green shoot"
/note="Green shoot (8 days old)
BASE COUNT 146 a 90 c 136 g 87 t 5 others
ORIGIN



Query Match 50.4%; Score 212.2; DB 11; Length 464; 

Best Local Similarity 73.1%; Pred. No. 1.6e-43; 

Matches 296; Conservative 0; Mismatches 107; Indels 2; Gaps 2; 

Qy 17 aaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagctgatggcggaggc 76 

III I I I I I I III II I I I I I I I I I I I I II III 

Db 32 AAAGAAAGAAGAAAAACAAAGCGCGTAAAGAGAAGCGAGAGAAATCGGAGAGGAAGATGG 91 

Qy 77 ccaggggaaagcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaagtg 136

I I I II I I I I I I I I I II II I I I I I I I I I I I I I I I I I 
Db 92 GGGAAGAGGCGGCGAANAANATGGCGGAAGCCCGGGCAAGATTGAATCCATGAGGAAGTG 151 

Qy 137 ggtcgtcgagcacaagctccgagccgtaggttgcctctggctaggtgggatcagcagttc 196 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I II 
Db 152 GGTCATCGACCACAAGCTCCGCGCCGTA-GTTGCCTATGGCTTACTGGGATCAGCAGCTC 210 

Qy 197 gatcgcctacaactggtcgcggcccaatatgaagcctagcgtcaagatcatccacgcaag 256 

III II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I 
Db 211 GATTGCGTACAACTGGTCGAGGCCCAATATGAAGACTAGCGTCAAGATCATCCATGCAAG 270

Qy 257 gttgcatgctcaagctctaaccctggctgcattagttggttctgcatgcgtggagtacta 316 

I I I I I I I I I I I I I I I I I II I II II II I I I I I II I I I I I I II I I II I I I I 
Db 271 GTTGCATGCTCAAGCCCTAACACTAGCAGCGTTAGTGGGATCTGCAATGGTAGAGTACTA 330 

Qy 317 tgaccagaagta-tggttcttctgggccaaaggtggacaaatacacaagccaatacctgg 375 

I I I I I I I II I II I I I I I I II I II I 1 I I I I I I I I I I I M I I I I II I I I I 
Db 331 TGACGCGAAGTACGGGCACATCTGGACCNAAGTGGGACAAGTACACAAGCCAATACCTGG 390 

Qy 376 cccattcccataaagattaaaggtcgccatgttggttcctgcatg 420 

I I I I I I II I I I I I I I I II I I I I I II II 

Db 391 NGCATTCACATTAAGGTTAAGGTCTTTCATATTCGCTGTTGGGTG 435



RESULT 10 

D22742 

LOCUS D22742 431 bp mRNA EST 08-JUL-1999
DEFINITION RICC1155A Rice callus Oryza sativa cDNA clone C1155_1A, mRNA
sequence.
ACCESSION D22742
VERSION D22742.1 GI:431806
KEYWORDS EST.
SOURCE Oryza sativa.
ORGANISM Oryza sativa
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
Ehrhartoideae; Oryzeae; Oryza.



REFERENCE 1 (bases 1 to 431) 

AUTHORS Sasaki,T. and Minobe,Y.

TITLE Rice cDNA from callus 

JOURNAL Unpublished (1994) 
COMMENT Contact: Takuji Sasaki 

National Institute of Agrobiological Resources 

Rice Genome Research Program, Kannondai 2-1-2, Tsukuba, Ibaraki 
305-8602, Japan 
Tel: 81-298-38-7441 
Fax: 81-298-38-7468 

Email: tsasaki@abr.affrc.go.jp, URL: http://rgp.dna.affrc.go.jp/
PROJECT='RGP'.
FEATURES Location/Qualifiers 
source 1. .431 

/organism="Oryza sativa" 

/strain="cultivar Nipponbare, sub_species Japonica" 
/db_xref="taxon:4530"
/clone="C1155_1A"
/clone_lib="Rice callus"
/note="Vector: pBluescript II SK+; Site_1: SalI; Site_2:
NotI; cDNA prepared from rice callus mRNAs by using
oligo(dT) as a primer and ligating to the SalI-NotI site
of pBluescript II SK+ phagemid."

BASE COUNT 144 a 86 c 128 g 73 t 

ORIGIN 



Query Match 50.1%; Score 211; DB 11; Length 431; 

Best Local Similarity 75.2%; Pred. No. 3.3e-43; 

Matches 318; Conservative 0; Mismatches 95; Indels 10; Gaps 

Qy 2 aaaaaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaa 61 

I I I I M I I I I I I I II I I II Mill I III I I I I I I 
Db 9 AAGAAAAGAGCTCGAAAAAAAAAAGAAAGAAGAAAAACAAAGCGCGTAAAGAGAAGCGAG 68 

Qy 62 gctgatggcggaggcccaggggaaa gcaaagcaaatggcggaggccccgagca 114 

III I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I 

Db 69 AGAAATCGGAGAGGAAGATGGGGGAAGAGGCGGCGAAGCAAATGGCGGAAG-CCCGGGCA 127 

Qy 115 agatcgaatccatgaggaagtgggtcgtcgagcacaagctccgagccgtaggttgcctct 174

I I I I II I I I M I II I I I I M M I I I MM II I I M I I I I I M I I I I I M I I I I I 
Db 128 AGATTGAATCCATGAGGAAGTGGGTCATCGACCACAAGCTCCGCGCCGTA-GTTGCCTAT 186

Qy 175 ggctaggtgggatcagcagttcgatcgcctacaactggtcgcggcccaatatgaagccta 234 

I II I I M II I I I I II I II II I II II I I I M I I M I M II M I I I I II II I II 
Db 187 GGCTTACTGGGATCAGCAGCTCGATTGCGTACAACTGGTCGAGGCCCAATATGAAGACTA 246

Qy 235 gcgtcaagatcatccacgcaaggttgcatgctcaagctctaaccctggctgcattagt-t 293 

I II I I I II I I II I I I I I I I I I I I II I I I I I I I I II I I I I I I II II II I I II I 

Db 247 GCGTCAAGATCATCCATGCAAGGTTGCATGCTCAAGCCCTAACACTAGCAGCGTTAGTGG 306 

Qy 294 ggttctgcatgcgtggagtactatgaccagaagtatggttcttctgggccaaaggtggac 353 

II I II I M II II I I II I II I I I II II I I I I I I I M I I I I I II II 

Db 307 GGATCTGCAATGGTAGAGTACTATGACGCGAAGTACGGCACATCTGGACCGAGGTGGGAC 366 

Qy 354 aaatacacaagccaatacctggcccattcccataaagattaaaggtcgccatgttggttc 413 
M II I I II I II I I I I I I II I I I I I I I I I I I II I I I II I I I I I I I 



Db 367 AAGTACACAAGCCAATACCTGGCGCATTCACATAAGGATTAAGATCTTCATATTCGGTTG 426 



Qy 414 ctg 416 
I I 

Db 427 TTG 429 



RESULT 11 

C97246 

LOCUS C97246 422 bp mRNA EST 19-OCT-1998
DEFINITION C97246 Rice callus Oryza sativa cDNA clone C53369_1A, mRNA
sequence.
ACCESSION C97246
VERSION C97246.1 GI:3759988
KEYWORDS EST.
SOURCE Oryza sativa.
ORGANISM Oryza sativa
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
Ehrhartoideae; Oryzeae; Oryza.
REFERENCE 1 (bases 1 to 422)
AUTHORS Sasaki,T. and Minobe,Y.
TITLE Rice cDNA from callus
JOURNAL Unpublished (1994)
COMMENT Contact: Takuji Sasaki
National Institute of Agrobiological Resources
Rice Genome Research Program, Kannondai 2-1-2, Tsukuba, Ibaraki
305-8602, Japan
Tel: 81-298-38-7441
Fax: 81-298-38-7468
Email: tsasaki@abr.affrc.go.jp, URL: http://rgp.dna.affrc.go.jp/
PROJECT='RGP'.
FEATURES Location/Qualifiers
source 1..422
/organism="Oryza sativa"
/strain="cultivar Nipponbare, sub_species Japonica"
/db_xref="taxon:4530"
/clone="C53369_1A"
/clone_lib="Rice callus"
/note="Vector: pBluescript II SK+; Site_1: SalI; Site_2:
NotI; cDNA prepared from rice callus mRNAs by using
oligo(dT) as a primer and ligating to the SalI-NotI site
of pBluescript II SK+ phagemid."
BASE COUNT 141 a 87 c 116 g 75 t 3 others
ORIGIN



Query Match 45.9%; Score 193.4; DB 11; Length 422; 

Best Local Similarity 74.4%; Pred. No. 9e-39;

Matches 296; Conservative 0; Mismatches 94; Indels 8; Gaps 

Qy 5 aaaataactcggaaaagaaggagacgccgaaaattcgaaaggggaggggaaagcaaagct 64 

I I I I I I I I I I I I I I I I I I I I I I I III I I I M I 
Db 2 AAAAGAGCTCGAAAAAAAAAAGAAAGAAGAAAAACAAAGCGCGTAAAGAGAAGCGAGAGA 61 

Qy 65 gatggcggaggcccagggg aaagcaaagcaaatggcggaggccccgagcaagatcg 120 

III I I I I I I I I III I II I I III I I I I I I I I 



Db 62 flnTrrran:a(^na apatpppppa apapppppppapa a a tppppp a appppnppp a apattp 121

Qy 121 aatccatgaggaagtgggtcgtcgagcacaagctccgagccgtaggttgcctctggctag 180

1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1
Db 122 A ATPP A TP APP A APTPPPTP ATPPAPPAP A &ar T VCCCCClCCCl r Y A-PTTPPPT ATPPTNTA 180

Qy 181 gtgggatcagcagttcgatcgcctacaactggtcgcggcccaatatgaagcctagcgtca 240

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 181 PTPPPATP APP APPTPPATTPPPT AP A APTPPTPPAPPPPP A AT ATPAAPAPTAPPPTPA 240

Qy 241 agatcatccacgcaaggttgcatgctcaagctctaaccctggctgcattagt-tggttct 299

1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II II 1 1 1 1 1 II III
Db 241 AP A TP ATPP A TPP A APPTTPP ATPPTP A APPPPT A AP APT APP APPPTT APTPPPPATPT 300

Qy 300 gcatgcgtggagt actatgaccagaagtatggtt cttct gggccaaaggtggac aaat 357

III 1 1 1 ! 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 III
Db 301 GCAATGGTAGAGTACTATGACGCGAAGTACGGCACATCTTGACCCAAGTNGGACCAAAGT 360

Qy 358 acacaagccaatacctggcccattcccataaagattaa 395

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db 361 ACACAAGCCAATACCTGGCGCATTCACATAAAGATTAA 398





RESULT 12 

BF317828 

LOCUS BF317828 231 bp mRNA EST 21-NOV-2000
DEFINITION OV1_10_B02.g1_A002 Ovary 1 (OV1) Sorghum bicolor cDNA, mRNA
sequence.
ACCESSION BF317828
VERSION BF317828.1 GI:11266334
KEYWORDS EST.
SOURCE sorghum.
ORGANISM Sorghum bicolor
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE 1 (bases 1 to 231)
AUTHORS Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
Pratt,L.H.
TITLE An EST database from Sorghum: ovaries of varying immature stages
JOURNAL Unpublished (2000)
COMMENT Contact: Cordonnier-Pratt MM
Department of Botany
The University of Georgia
Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
Tel: 706 542 1860
Fax: 706 542 1805
Email: mmpratt@uga.edu
Sequences have been trimmed to exclude PolyA, vector and regions
below Phred quality 16. The threshold for highest quality sequence
is 20.
Seq primer: PolyTMix
High quality sequence start: 69
High quality sequence stop: 200
POLYA=No.
FEATURES Location/Qualifiers
source 1..231
/organism="Sorghum bicolor"
/db_xref="taxon:4558"
/clone_lib="Ovary 1 (OV1)"
/note="Organ: Mix of ovaries of varying immature stages
from 8-week-old plants; Vector: pBluescript II from Lambda
Zap II; Site_1: XhoI; Site_2: EcoRI; The library was made
from poly-A RNA in the cloning vector lambda ZAP II.
Clones to be sequenced were prepared by mass excision."
BASE COUNT 65 a 52 c 54 g 60 t
ORIGIN



Query Match 39.8%; Score 167.6; DB 11; Length 231; 

Best Local Similarity 94.8%; Pred. No. 2.7e-32; 

Matches 184; Conservative 0; Mismatches 9; Indels 1; Gaps 1; 

Qy 225 atgaagcctagcgtcaagatcatccacgcaaggttgcatgctcaagctctaaccctggct 284 

I I I I I II. I I I I I I I I I I I I I I I M I I I M I M I I I I I I I I I I II I I I I I I I I I I III 
Db 1 ATGAAGCCTAGCGTCAAGATCATCCACGCAAGGTTGCATGCACAAGGTCTAACCCTAGCT 60 

Qy 285 gcattagttggttctgcatgcgtggagtactatgaccagaagtatggttcttctgggcca 344 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 GCATTAGTTGGTTCTGCATGCGTGGAGTACTATGACAATAAGTATGGTTCTTCTGGGCCA 120 

Qy 345 aaggtggacaaatacacaagccaatacctggcccattcccataaagattaaaggtcgcca 404

I I I I I II II M I I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I I I I I II I I III 
Db 121 AAGGTGGACAAATACACAAGCCAATACCTGGCCCATGCGCATAAAGATT-AAGATCTCCA 179

Qy 405 tgttggttcctgca 418 

I I I I I I I II 1 I I I I 
Db 180 TGTTGGTTCCTGCA 193 



RESULT 13 
AI967165/C 
LOCUS AI967165 615 bp mRNA EST 23-AUG-1999
DEFINITION 614006G06.x3 614 - root cDNA library from Walbot Lab Zea mays cDNA,
mRNA sequence.
ACCESSION AI967165
VERSION AI967165.1 GI:5762117
KEYWORDS EST.
SOURCE Zea mays.
ORGANISM Zea mays
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Zea.
REFERENCE 1 (bases 1 to 615)
AUTHORS Walbot,V.
TITLE Maize ESTs from various cDNA libraries sequenced at Stanford
University
JOURNAL Unpublished (1999)
COMMENT Contact: Walbot V
Department of Biological Sciences
Stanford University
855 California Ave, Palo Alto, CA 94304, USA
Tel: 650 723 2227
Fax: 650 725 8221
Email: walbot@stanford.edu
Plate: 614006 row: G column: 06.
FEATURES Location/Qualifiers
source 1..615
/organism="Zea mays"
/cultivar="W23"
/db_xref="taxon:4577"
/clone_lib="614 - root cDNA library from Walbot Lab"
/tissue_type="root"
/dev_stage="3-4 days old"
/lab_host="XLOLR"
/note="Organ: root; Vector: pBlueScriptII SK+; Site_1:
EcoRI; Site_2: XhoI; 3-4 days old root tissue from Walbot
Lab (LM)"
BASE COUNT 210 a 133 c 117 g 154 t 1 others
ORIGIN



Query Match 39.3%; Score 165.4; DB 10; Length 615;
Best Local Similarity 99.4%; Pred. No. 1.1e-31;
Matches 166; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy 255 aggttgcatgctcaagctctaaccctggctgcattagttggttctgcatgcgtggagtac 314 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 11 I I I I I I I I I I I I I I I I I I I I I I 
Db 491 AGGTTGCATGCTCAAGCTCTAACCCTGGCTGCATTAGTTGGTTCTGCATGCGTGGAGTAC 432



Qy 315 tatgaccagaagtatggttcttctgggccaaaggtggacaaatacacaagccaatacctg 374 

I I I I I I I I I I I I I I I I I I I I I I II I I I I. I I I I I I I I 1 I I I I I I I I I I I I I I I I I II I I I I 
Db 431 TATGACCAGAAGTATGGTTCTTCTGGGCCAAAGGTGGACAAATACACAAGCCAATACCTG 372

Qy 375 gcccattcccataaagattaaaggtcgccatgttggttcctgcatgc 421 

I I I I I I I I I I I I I I I I I I I II II I I I I I I M I I I I I I I I II I I I I I 
Db 371 GCCCATTCGCATAAAGATTAAAGGTCGCCATGTTGGTTCCTGCATGC 325 



RESULT 14 

BF203045 

LOCUS BF203045 242 bp mRNA EST 06-NOV-2000
DEFINITION WHE1768_G09_M18ZS Wheat pre-anthesis spike cDNA library Triticum
aestivum cDNA clone WHE1768_G09_M18, mRNA sequence.
ACCESSION BF203045
VERSION BF203045.1 GI:11117799
KEYWORDS EST.
SOURCE bread wheat.
ORGANISM Triticum aestivum
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; Pooideae;
Triticeae; Triticum.
REFERENCE 1 (bases 1 to 242)
AUTHORS Anderson,O.D., Chao,S., Choi,D.W., Close,T.J., Fenton,R.D.,
Han,P.S., Hsia,C.C., Kang,Y., Lazo,G.R., Miller,R., Rausch,C.J.,
Seaton,C.L. and Tong,J.C.
TITLE The structure and function of the expressed portion of the wheat
genomes - Pre-anthesis spike cDNA library
JOURNAL Unpublished (2000)
COMMENT Contact: Olin Anderson
US Department of Agriculture, Agriculture Research Service, Pacific
West Area, Western Regional Research Center
800 Buchanan Street, Albany, CA 94710, USA
Tel: 5105595773
Fax: 5105595818
Email: oandersn@pw.usda.gov
Sequence have been trimmed to remove vector sequence and low
quality sequence with phred score less than 20
Seq primer: Stratagene SK primer.
FEATURES Location/Qualifiers
source 1..242
/organism="Triticum aestivum"
/cultivar="Chinese Spring"
/db_xref="taxon:4565"
/clone="WHE1768_G09_M18"
/clone_lib="Wheat pre-anthesis spike cDNA library"
/tissue_type="Spike before anthesis"
/dev_stage="Adult plant"
/lab_host="E. coli SOLR"
/note="Vector: Lambda Uni-ZAP XR, excised phagemid;
Site_1: EcoRI; Site_2: XhoI; Plants were grown in the
greenhouse. Whole spike with awns trimmed, white, green
and yellow anther were collected and total RNA, and
poly(A) RNA were prepared, a cDNA library was made, and
the cDNA clones were in vivo excised to give pBluescript
phagemids in the TJ Close lab (Choi, Close, Fenton) at
the University of California, Riverside. Plasmid DNA
preparations and DNA sequencing were performed in the OD
Anderson lab (all other authors)."
BASE COUNT 60 a 72 c 70 g 39 t 1 others
ORIGIN



Query Match 39.2%; Score 165.2; DB 11; Length 242; 

Best Local Similarity 88.2%; Pred. No. 1.1e-31;

Matches 179; Conservative 0; Mismatches 24; Indels 0; Gaps 



Qy 87 gcaaagcaaatggcggaggccccgagcaagatcgaatccatgaggaagtgggtcgtcgag 146

II I II I I I I I I I I I I I I I I I I I I I I I I M I II I I I I I I i I I I I I I M I I I I I I I I I
Db 40 GCGAAGCAAATGGCGGAGGCCCCGAGCAAGATCGAATCCATGCGGAAGTGGGTGGTCGAT 99

Qy 147 cacaagctccgagccgtaggttgcctctggctaggtgggatcagcagttcgatcgcctac 206

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I MINI I II II Mill Ml
Db 100 CACAAGCTCCGAGCCGTAGGTTGCCTGTGGCTTANCGGGATCTCCAGCTCCATCGCGTAC 159

Qy 207 aactggtcgcggcccaatatgaagcctagcgtcaagatcatccacgcaaggttgcatgct 266

I II I II I I I I I II II II I I I II I I I I I I I I I I I I II II I I II I II I I I I I I I II
Db 160 AACTGGTCGCGGCCCAACATGAAGACCAGCGTCAAGCTCATCCACGCAAGGTTGCACGCG 219



Qy 267 caagctctaaccctggctgcatt 289 

II I I II II I II I I I II I M 
Db 220 CAAGCTCTAACGATCGCTGCCTT 242 



RESULT 15 
C97312 

LOCUS C97312 352 bp mRNA EST 19-OCT-1998
DEFINITION C97312 Rice callus Oryza sativa cDNA clone C53986_1A, mRNA
sequence.
ACCESSION C97312
VERSION C97312.1 GI:3760054
KEYWORDS EST.
SOURCE Oryza sativa.
ORGANISM Oryza sativa
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
Ehrhartoideae; Oryzeae; Oryza.
REFERENCE 1 (bases 1 to 352)
AUTHORS Sasaki,T. and Minobe,Y.
TITLE Rice cDNA from callus
JOURNAL Unpublished (1994)
COMMENT Contact: Takuji Sasaki
National Institute of Agrobiological Resources
Rice Genome Research Program, Kannondai 2-1-2, Tsukuba, Ibaraki
305-8602, Japan
Tel: 81-298-38-7441
Fax: 81-298-38-7468
Email: tsasaki@abr.affrc.go.jp, URL: http://rgp.dna.affrc.go.jp/
PROJECT='RGP'.
FEATURES Location/Qualifiers
source 1..352
/organism="Oryza sativa"
/strain="cultivar Nipponbare, sub_species Japonica"
/db_xref="taxon:4530"
/clone="C53986_1A"
/clone_lib="Rice callus"
/note="Vector: pBluescript II SK+; Site_1: SalI; Site_2:
NotI; cDNA prepared from rice callus mRNAs by using
oligo(dT) as a primer and ligating to the SalI-NotI site
of pBluescript II SK+ phagemid."
BASE COUNT 78 a 88 c 120 g 65 t 1 others
ORIGIN



Query Match 38.7%; Score 163; DB 11; Length 352;
Best Local Similarity 75.7%; Pred. No. 4.1e-31;
Matches 202; Conservative 0; Mismatches 65; Indels 0; Gaps

Qy 68
Db 43

Qy 128
Db 103

Qy 188 cagcagttcgatcgcctacaactggtcgcggcccaatatgaagcctagcgtcaagatcat 247

II I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I
Db 163 GGCGAGCTCGATCGCGTACAACTGGTCGAGGCCCGGCATGAAGACCAGCGTCAAGATCAT 222

Qy 248 ccacgcaaggttgcatgctcaagctctaaccctggctgcattagttggttctgcatgcgt 307

I I I I I I I I I I I I II I I I I I I II II II I I I I II I I I I I I I I I I I I I I I II
Db 223 CCACGCCAGGTTGCATGCTCAGGCCCTCACACTGGCTGCATTGGCTGGCTCTGCACTGGT 282

Qy 308 ggagtactatgaccagaagtatggttc 334

I I I I I I I I I Mill II I I I I I
Db 283 GGAGTACTACGACCATCGGTCAGGTTC 309



Search completed: February 7, 2002, 08:20:37 
Job time: 18114 sec 



